| Indicator ID | NO_POPO_001 |
| Indicator Name | Pollinator potential |
| Country | Norway |
| Continent | Europe |
| Ecosystem Condition Typology Class | B1 - Compositional State Characteristics |
| Realm | Terrestrial (T) |
| Biome | T4 Savannas and grasslands biome |
| Ecosystem | T4.5 Temperate grasslands |
| Year added | 2024 |
| Last update | 2024 |
| status | incomplete |
| Version | 000.001 |
| Version comment | First draft |
| url | https://github.com/NINAnor/ecRxiv/tree/main/indicators/NO_POPO |
Show metadata
Pollinator potential
1. Introduction
We aim to model the ecosystem conditions in Norway given their known interactions with some plant species. We develop pollinator indicators which are used to describe ecosystem conditions by focusing on the ASO data meadows, which represent the semi-natural habitats we are interested in describing. To achieve this objective, we do the following:
- obtain data from the Global Biodiversity Infrastructure Facility (GBIF ) for bumblebees, butterflies and bees (which we will refer to as pollinators in this study) and plant hosts (only the plant species we have interaction data of). We also obtain covariates that will be needed for the model fit Table 8.
- fit an integrated species distribution model (ISDM) to the data and predict the intensities of each pollinator (average abundance of each pollinator within a grid cell) of across the
alpine,borealandtemperatezones [REF to the table with zones]. Cluster all the pollinator species (over \(8000\) of them) either as alpine, boreal or temperate pollinators and assign each set of species the predicted intensities estimated. Also, fit an ISDM to the plant host data and predict their occurrence probability across the study region.
- fit an integrated species distribution model (ISDM) to the data and predict the intensities of each pollinator (average abundance of each pollinator within a grid cell) of across the
- obtain a web plot interaction matrix of the pollinator and plant hosts using data described in Section 1.9.4.
- using the intensities of the pollinator species and plant species occurrence probabilities (see (a)), together with the interaction matrix (see (c)), estimate the alpha diversity index (specifically species richness, Shannon and Simpson indices) of the pollinators across Norway.
- extract the diversity indices, latitude and annual temperature value for the ASO meadow locations. Cluster the diversity indices based on the geo-climatic region (defined by the latitude and annual temperature; see Table 4), and estimate the maximum diversity for each geo-climatic region.
- scale the rest of the diversity indices by this indicator value to obtain nominal indicators for the study region. We then describe the ecosystem conditions using a threshold value of our choice.
We summarise these steps in Figure 1.
2. About the underlying data
The pollinator (GBIF.Org User 2024a) and plant species (GBIF.Org User 2024b) data used to fit the ISDMs were obtained from GBIF. The data was downloaded via the R-package rgbif (Chamberlain et al. 2024), and its associated metadata was extracted using the same R-package. The downloaded GBIF data contains \(26\) datasets and \(30\) datasets with records on pollinators and plants respectively. We present a summary of the datasets downloaded for the pollinator and plant species respectively in Table 6 and Table 7 respectively.
From the metadata, we ascertain the type of each dataset: either presence-only, presence-absence or count data. We merge all the presence-only data into one dataset (which we call mergedDatasetPO), but we do not merge the rest of the datasets. This is because the presence-absence data (processed from sampling event protocols) have different sampling protocols, and we would like to preserve their unique attributes. A summary of the formatted dataset used to fit the ISDMs are presented in Table 2 and Table 3.
Code
######################################################
# POLLINATOR DATA FORMAT
#####################################################
pollinatorFolder <- paste0(dataFolder,"/pollinatorDataFolder")
taxonKey <- c(811, 797, 1457)
formattedData <- list()
if(!file.exists(paste0(pollinatorFolder, "/formatData/formattedPollinatorData.RDS"))){
#for(i in seq_along(taxonKey)){
occurrences <- lapply(seq_along(taxonKey), function(i){
# select the taxaKey for a particular pollinator
taxaIndex <- taxonKey[i]
#if(!file.exists(paste0(pollinatorFolder, "/", taxaIndex,"formattedData.RDS"))){
if(!file.exists(paste0(pollinatorFolder, "/", taxaIndex,"occurrences.RDS"))){
# Get download key
if (file.exists(paste0(pollinatorFolder, "/",taxaIndex, "DownloadKey.RDS"))) {
downloadKey <- readRDS(paste0(pollinatorFolder, "/",taxaIndex, "DownloadKey.RDS"))
} else {
downloadKey <- getDownloadKey(taxaIndex, regionGeometry)
if(waitForGbif){
message("Download key has been created and will download once it is ready (5-30 minutes). ",
"View the download status at https://www.gbif.org/occurrence/download/",
downloadKey)
downloadKey <- occ_download_wait(downloadKey, curlopts = list(), quiet = FALSE)
base::attr(downloadKey,'doi') <- downloadKey$doi
saveRDS(downloadKey, file = paste0(pollinatorFolder, "/",taxaIndex, "DownloadKey.RDS"))
} else {
downloadKey <- occ_download_meta(downloadKey)
saveRDS(downloadKey, file = paste0(pollinatorFolder, "/",taxaIndex, "DownloadKey.RDS"))
stop(paste0("Download key has been created and your download is being prepared. View the download at https://www.gbif.org/occurrence/download/",
downloadKey$key, ". Come back and start the download in 5-30 minutes."))
}
}
# Start GBIF Download
#### FORMAT SCHEDULED DOWNLOAD ####
# This script takes your download key and downloads and produces a data frame containing all your
# relevant species occurrences. It also adds a few important columns.
# Download and unzip the file from GBIF
downloadGet <- occ_download_get(key=downloadKey$key, path=paste0(pollinatorFolder), overwrite=TRUE)
occurrences <- occ_download_import(downloadGet)
saveRDS(occurrences,
paste0(pollinatorFolder, "/", taxaIndex,"occurrences.RDS"))
} else {
occurrences <- readRDS(paste0(pollinatorFolder, "/", taxaIndex,"occurrences.RDS"))
}
# Create dataset types
# using the default creation from the hotspot project
# Import metadata information
if(file.exists(paste0(dataFolder, "/metadataSummary.csv"))){
dataTypes <- read.csv(paste0(dataFolder, "/metadataSummary.csv"))
}
# Reduce to relevant columns immediately to save space
occurrences <- occurrences[,c("acceptedScientificName", "decimalLongitude", "decimalLatitude", "basisOfRecord",
"year", "month", "datasetKey", "datasetName", "taxonRank", "kingdomKey", "phylumKey", "classKey", "orderKey",
"familyKey", "genusKey", "speciesKey", "issue", "genus", "order", "family", "occurrenceID")]
# Remove any occurrences with certain issues attached to them
issuesToFlag <- c("ZERO_COORDINATE|COORDINATE_OUT_OF_RANGE|COORDINATE_INVALID|COORDINATE_PRECISION_INVALID|COORDINATE_UNCERTAINTY_METRES_INVALID")
occurrences <- occurrences %>%
filter(datasetKey %in% dataTypes$datasetKey[!is.na(dataTypes$processing)]) %>%
filter(!grepl(issuesToFlag,issue))
# Assign a taxon key based on what level of taxonomy the key is valid for
occurrences$taxonKeyProject <- ifelse(occurrences$speciesKey %in% taxonKey, occurrences$speciesKey,
ifelse(occurrences$genusKey %in% taxonKey, occurrences$genusKey,
ifelse(occurrences$familyKey %in% taxonKey, occurrences$familyKey,
ifelse(occurrences$orderKey %in% taxonKey, occurrences$orderKey,
ifelse(occurrences$classKey %in% taxonKey, occurrences$classKey,
ifelse(occurrences$phylumKey %in% taxonKey, occurrences$phylumKey,
ifelse(occurrences$kingdomKey %in% taxonKey, occurrences$kingdomKey, NA)))))))
# Add taxa and reduce to relevant columns
occurrences$taxa <- taxaIndex #focalTaxon$taxa[match(occurrences$taxonKeyProject, focalTaxon$key)]
occurrences <- occurrences[,c("acceptedScientificName", "decimalLongitude", "decimalLatitude", "basisOfRecord",
"year", "month", "datasetKey", "datasetName", "taxa", "taxonKeyProject", "taxonRank", "genus", "order", "family", "occurrenceID")] %>%
filter(!is.na(taxa))
# }
})
# Put all groups together
occurrences <- occurrences %>%
do.call("rbind", .)%>%
dplyr::filter(order %in% c("Diptera", "Lepidoptera", "Hymenoptera"))%>%
dplyr::filter(year > 2009)%>%
dplyr::mutate(Taxon = ifelse(order == "Lepidoptera", "Butterflies",
ifelse(order == "Hymenoptera", "Bees",
"Hoverflies")))
###------------------------------###
### 3. Attach relevant metadata ####
###------------------------------###
# Now we import metadata related to GBIF data
metadataList <- metadataPrep(occurrences,
metaSummary = TRUE)
# Import dataset type based on dataset name. If no dataset information is provided, all data will be downloaded and assumed to
# be presence only data
GBIFImportCompiled <- merge(occurrences,
metadataList$metadata,
all.x=TRUE,
by = "datasetKey")
# Narrow down to known data types and split into data frames
projcrs <- "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
GBIFLists <- lapply(unique(GBIFImportCompiled$name), FUN = function(x) {
GBIFItem <- GBIFImportCompiled[GBIFImportCompiled$name == x,]
GBIFItem <- st_as_sf(GBIFItem,
coords = c("decimalLongitude", "decimalLatitude"),
crs = projcrs)%>%
st_transform(., newCrs)
GBIFcropped <- st_intersection(GBIFItem, regionGeometry)
GBIFcropped
})
names(GBIFLists) <- unique(GBIFImportCompiled$name)
# Put the dataLists together and save them
dataList <- list(species = GBIFLists,
metadata = metadataList,
projcrs = newCrs)
base::attr(dataList, "level") <- base::attr(regionGeometry, "level")
base::attr(dataList, "region") <- base::attr(regionGeometry, "region")
saveRDS(dataList,
paste0(pollinatorFolder, "/speciesDataImported.RDS"))
##########################
# Process datasets
##########################
speciesData <- dataList[["species"]]
metadata <- dataList$metadata$metadata
rm("dataList")
# start processing each dataset
processedData <- list()
namesProcessedData <- c()
for (ds in seq_along(speciesData)) {
print(ds)
focalData <- speciesData[[ds]]
# If the dataset is empty, skip it
if (nrow(focalData) == 0) next
dataType <- unique(focalData$type)
#if(is.null(dataType)) dataType <- "OCCURRENCE"
datasetName <- names(speciesData)[ds]
newDataset <- NULL
cat("Currently processing dataset '", datasetName,"' \n", sep = "")
if (dataType == "SAMPLING_EVENT") {
focalEndpoint <- metadata$DWCEndpoint[metadata$name == datasetName]
newDataset <- tryCatch(processFieldNotesEventGenus(focalEndpoint, pollinatorFolder , datasetName, regionGeometry, taxonKey, newCrs, projcrs, focalData),
error = function(e) {NULL})
# No need to do anything to presence only data (yet) except add individualCount column
} else if (dataType == "OCCURRENCE") {
#ensure the coordinates are distrinct
# coordsgbif.no <- datagbif.no[,c("lon","lat")]
focalData <- focalData %>%
dplyr::group_by(Taxon) %>%
dplyr::distinct(geometry)
focalData$dataType <- "PO"
newDataset <- focalData[,c("acceptedScientificName", "geometry", "dataType", "taxa", "year", "taxonKeyProject", "order")]
} else if (dataType == "insectMonitoring") {
focalEndpoint <- metadata$DWCEndpoint[metadata$name == datasetName]
newDataset <- processNationalInsectMonitoring(focalData, focalEndpoint, pollinatorFolder, regionGeometry)
}
if (is.null(newDataset)) {next}
# convert year to numeric
newDataset$year <- as.numeric(newDataset$year)
# Save and name new dataset
processedData[[ds]] <- newDataset
namesProcessedData[ds] <- datasetName
}
names(processedData) <- namesProcessedData
#extract processed Data without NA names
processedData <- processedData[namesProcessedData[!is.na(namesProcessedData)]]
# Save for use in model construction
processedData <- processedData[lapply(processedData,nrow)>0]
saveRDS(processedData, paste0(pollinatorFolder, "/speciesDataProcessed.RDS"))
#Check if all the datasets are of the same dataType
mergedDataWithType <- lapply(processedData, function(x){
unique(x$dataType)
})%>%
do.call("rbind", .)%>%
as.data.frame(.)%>%
mutate(datasetName = rownames(.))%>%
rename(dataType = V1)
uniqueDataType <- unique(mergedDataWithType$dataType)
mergedDatasets <- list()
# Merge all presence-only data
dataToMerge <- mergedDataWithType[mergedDataWithType$dataType == "PO", ]
mergedDatasets[[1]] <- processedData[dataToMerge$datasetName]%>%
do.call("rbind", .)%>%
as.data.frame(., row.names = NULL)%>%
dplyr::group_by(acceptedScientificName, dataType, taxa, year, taxonKeyProject, order)%>%
distinct(geometry)%>%
dplyr::ungroup()
rownames(mergedDatasets[[1]]) <- NULL
mergedDatasets[[1]] <- mergedDatasets[[1]] %>%
st_as_sf(., crs = newCrs)%>%
filter(!is.na(order))%>%
dplyr::select(order, dataType) %>%
dplyr::group_by(dataType, order)%>%
distinct()%>%
dplyr::ungroup()
names(mergedDatasets)[1] <- paste0("mergedDataset", "PO")
# Format PA data
dataToMerge <- mergedDataWithType[mergedDataWithType$dataType == "PA", ]
paData <- lapply(dataToMerge$datasetName, function(x){
processedData[[x]] %>%
dplyr::group_by(individualCount, dataType, taxa, year, taxonKeyProject, order) %>%
distinct(geometry)%>%
dplyr::ungroup()
})
names(paData) <- dataToMerge$datasetName
# Format count data
dataToMerge <- mergedDataWithType[mergedDataWithType$dataType == "Counts", ]
countData <- processedData[dataToMerge$datasetName]
# Put all datasets togehter
mergedDatasets <- c(mergedDatasets,
countData,
paData
)
saveRDS(mergedDatasets,
paste0(pollinatorFolder, "/formatData/formattedPollinatorData.RDS"))
} else {
pollinatorData <- readRDS(paste0(pollinatorFolder, "/formatData/formattedPollinatorData.RDS"))
}
# Format the data
PointsBeesAndButterfliesAndHoverflies <- lapply(pollinatorData, function(x){
ret <- x %>%
dplyr::mutate(Taxon = ifelse(order == "Lepidoptera", "Butterflies",
ifelse(order == "Hymenoptera", "Bees",
"Hoverflies")))
return(ret)
})
# Shorten the names of the datasets
shortNames <- str_replace_all(names(PointsBeesAndButterfliesAndHoverflies), " ", "")
names(PointsBeesAndButterfliesAndHoverflies) <- shortNames
# We need to filter out some taxonomic groups
beesOnly <- c("Solitarybeescollectedinalarge-scalefieldexperimentinpowerlineclearings,southeastNorway" , "Bumblebeescollectedinalarge-scalefieldexperimentinpowerlineclearings,southeastNorway")
ret <- lapply(PointsBeesAndButterfliesAndHoverflies[beesOnly], function(x){
x %>%
filter(Taxon == "Bees")
})
PointsBeesAndButterfliesAndHoverflies[beesOnly] <- ret
beesAndButterfles <- c("BumblebeesandbutterfliesinNorway")
ret <- lapply(PointsBeesAndButterfliesAndHoverflies[beesAndButterfles], function(x){
x %>%
filter(Taxon != "Hoverflies")
})
PointsBeesAndButterfliesAndHoverflies[beesAndButterfles] <- ret
################################################
# PLANT SPECIES FORMAT
####################################################
# The plant data format is similar to the pollinators
# Except we select the species within the pollinator-plants interaction data.
if(!file.exists(paste0(dataFolder, "/plantDataFolder/formattedData/formattedPlantData.RDS"))){
taxonKey <- 7707728
if (file.exists(paste0(dataFolder, "/downloadKey.RDS"))) {
downloadKey <- readRDS(paste0(dataFolder, "/downloadKey.RDS"))
} else {
downloadKey <- getDownloadKey(taxonKey, regionGeometry)
if(waitForGbif){
message("Download key has been created and will download once it is ready (5-30 minutes). ",
"View the download status at https://www.gbif.org/occurrence/download/",
downloadKey)
downloadKey <- occ_download_wait(downloadKey, curlopts = list(), quiet = FALSE)
base::attr(downloadKey,'doi') <- downloadKey$doi
saveRDS(downloadKey, file = paste0(dataFolder, "/downloadKey.RDS"))
} else {
downloadKey <- occ_download_meta(downloadKey)
saveRDS(downloadKey, file = paste0(dataFolder, "/downloadKey.RDS"))
stop(paste0("Download key has been created and your download is being prepared. View the download at https://www.gbif.org/occurrence/download/",
downloadKey$key, ". Come back and start the download in 5-30 minutes."))
}
}
# Start GBIF Download
#### FORMAT SCHEDULED DOWNLOAD ####
# This script takes your download key and downloads and produces a data frame containing all your
# relevant species occurrences. It also adds a few important columns.
# Download and unzip the file from GBIF
if(!file.exists(paste0(dataFolder, "/plantDataFolder/data/occurrences.RDS"))){
downloadGet <- occ_download_get(key=downloadKey$key, path=paste0(dataFolder, "/plantDataFolder/data"), overwrite=TRUE)
occurrences <- occ_download_import(downloadGet)
saveRDS(occurrences,
paste0(dataFolder, "/plantDataFolder/data/occurrences.RDS"))
} else {
occurrences <- readRDS(paste0(dataFolder, "/plantDataFolder/data/occurrences.RDS"))
}
# Create dataset types
# using the default creation from the hotspot project
# Import metadata information
if(file.exists(paste0(dataFolder, "/metadataSummary.csv"))){
dataTypes <- read.csv(paste0(dataFolder, "/metadataSummary.csv"))
}
# Reduce to relevant columns immediately to save space
occurrences <- occurrences[,c("acceptedScientificName", "decimalLongitude", "decimalLatitude", "basisOfRecord",
"year", "month", "datasetKey", "datasetName", "taxonRank", "kingdomKey", "phylumKey", "classKey", "orderKey",
"familyKey", "genusKey", "speciesKey", "issue", "genus", "order")]
# Remove any occurrences with certain issues attached to them
issuesToFlag <- c("ZERO_COORDINATE|COORDINATE_OUT_OF_RANGE|COORDINATE_INVALID|COORDINATE_PRECISION_INVALID|COORDINATE_UNCERTAINTY_METRES_INVALID")
occurrences <- occurrences %>%
filter(datasetKey %in% dataTypes$datasetKey[!is.na(dataTypes$processing)]) %>%
filter(!grepl(issuesToFlag,issue))
# Assign a taxon key based on what level of taxonomy the key is valid for
occurrences$taxonKeyProject <- ifelse(occurrences$speciesKey %in% taxonKey, occurrences$speciesKey,
ifelse(occurrences$genusKey %in% taxonKey, occurrences$genusKey,
ifelse(occurrences$familyKey %in% taxonKey, occurrences$familyKey,
ifelse(occurrences$orderKey %in% taxonKey, occurrences$orderKey,
ifelse(occurrences$classKey %in% taxonKey, occurrences$classKey,
ifelse(occurrences$phylumKey %in% taxonKey, occurrences$phylumKey,
ifelse(occurrences$kingdomKey %in% taxonKey, occurrences$kingdomKey, NA)))))))
# Add taxa and reduce to relevant columns
occurrences$taxa <- "plants"#focalTaxon$taxa[match(occurrences$taxonKeyProject, focalTaxon$key)]
occurrences <- occurrences[,c("acceptedScientificName", "decimalLongitude", "decimalLatitude", "basisOfRecord",
"year", "month", "datasetKey", "datasetName", "taxa", "taxonKeyProject", "taxonRank", "genus", "order")] %>%
filter(!is.na(taxa))
# Get the data with interractions
source("pipeline/webForInterractions/webPlotsForModels.R")
# select only the species within the interraction matrix
occurrences <- occurrences %>%
mutate(
simpleScientificName = coalesce(
#redList$species[match(acceptedScientificName, redList$GBIFName)], # Match redList species
str_extract(acceptedScientificName, "^[A-Za-z]+\\s+[a-z]+") # Extract binomial name
)
)%>%
dplyr::filter(simpleScientificName %in% interractionMatrix$species)%>%
dplyr::filter(year > 2009)
###------------------------------###
### 3. Attach relevant metadata ####
###------------------------------###
# Now we import metadata related to GBIF data
metadataList <- metadataPrep(occurrences, metaSummary = TRUE)
# Import dataset type based on dataset name. If no dataset information is provided, all data will be downloaded and assumed to
# be presence only data
GBIFImportCompiled <- merge(occurrences,
metadataList$metadata,
all.x=TRUE,
by = "datasetKey")
# Narrow down to known data types and split into data frames
projcrs <- "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
GBIFLists <- lapply(unique(GBIFImportCompiled$name), FUN = function(x) {
GBIFItem <- GBIFImportCompiled[GBIFImportCompiled$name == x,]
GBIFItem <- st_as_sf(GBIFItem,
coords = c("decimalLongitude", "decimalLatitude"),
crs = projcrs)%>%
st_transform(., newCrs)
GBIFcropped <- st_intersection(GBIFItem, regionGeometry)
GBIFcropped
})
names(GBIFLists) <- unique(GBIFImportCompiled$name)
# Put the dataLists together and save them
dataList <- list(species = GBIFLists,
metadata = metadataList,
projcrs = newCrs)
base::attr(dataList, "level") <- base::attr(regionGeometry, "level")
base::attr(dataList, "region") <- base::attr(regionGeometry, "region")
saveRDS(dataList,
paste0(dataFolder, "/plantDataFolder/formattedData/speciesDataImported.RDS"))
##########################
# Process datasets
##########################
speciesData <- dataList[["species"]]
metadata <- dataList$metadata$metadata
rm("dataList")
# start processing each dataset
processedData <- list()
namesProcessedData <- c()
for (ds in seq_along(speciesData)) {
focalData <- speciesData[[ds]]
# If the dataset is empty, skip it
if (nrow(focalData) == 0) next
dataType <- unique(focalData$type)
#if(is.null(dataType)) dataType <- "OCCURRENCE"
datasetName <- names(speciesData)[ds]
newDataset <- NULL
cat("Currently processing dataset '", datasetName,"' \n", sep = "")
if (dataType == "SAMPLING_EVENT") {
surveyedSpecies <- interractionMatrix$species
focalEndpoint <- metadata$DWCEndpoint[metadata$name == datasetName]
newDataset <- tryCatch(processFieldNotesEvent(focalEndpoint, dataFolder , datasetName, regionGeometry, taxonKey, newCrs, projcrs, focalData, surveyedSpecies),
error = function(e) {NULL})
# No need to do anything to presence only data (yet) except add individualCount column
} else if (dataType == "OCCURRENCE") {
focalData$dataType <- "PO"
newDataset <- focalData[,c("acceptedScientificName", "geometry", "dataType", "taxa", "year", "taxonKeyProject", "genus", "order")]
}
if (is.null(newDataset)) {break}
# convert year to numeric
newDataset$year <- as.numeric(newDataset$year)
# Save and name new dataset
processedData[[ds]] <- newDataset
namesProcessedData[ds] <- datasetName
}
names(processedData) <- namesProcessedData
# Save for use in model construction
processedData <- processedData[lapply(processedData,nrow)>0]
saveRDS(processedData, paste0(dataFolder, "/plantDataFolder/formattedData/speciesDataProcessed.RDS"))
###--------------------------------###
### 3. Compile into one data.frame ####
###--------------------------------###
# Edit data frames to have the same number of columns
processedDataCompiled <- do.call(rbind, lapply(1:length(processedData), FUN = function(x) {
dataset <- processedData[[x]]
datasetName <- names(processedData)[x]
datasetType <- unique(dataset$dataType)
if (datasetType == "PO") {
dataset$individualCount <- 1
}
datasetShort <- dataset[, c("acceptedScientificName", "individualCount", "geometry", "taxa", "year", "dataType",
"taxonKeyProject", "genus", "order")]
datasetShort$dsName <- datasetName
datasetShort
}))
# Remove absences, combine into one data frame and add date accessed
processedPresenceData <- processedDataCompiled[processedDataCompiled$individualCount > 0,]
saveRDS(processedPresenceData, paste0(dataFolder, "/plantDataFolder/processedPresenceData.RDS"))
rm("GBIFLists")
rm("GBIFImportCompiled")
rm("newDataset")
############################
# Matching Processed data species to that from ASO data
#########################
# We are using ASO data as the basis for modelling the plant interractions
# So the species in the ASOdata are selected and
# all datasets are filtered to have those species
asoDataNames <- processedData[[19]]
saveRDS(asoDataNames, paste0(dataFolder, "/plantDataFolder/formattedData/asoData.RDS"))
asoDataNames <- unique(asoDataNames$acceptedScientificName)
# selecting the ASO data species
formattedData <- lapply(processedData, function(x){
out <- x %>%
dplyr::filter(acceptedScientificName %in% asoDataNames)
return(out)
})
rm("processedData")
rm("occurrences")
rm("speciesData")
#Check if all the datasets are of the same dataType
mergedDataWithType <- lapply(formattedData, function(x){
unique(x$dataType)
})%>%
do.call("rbind", .)%>%
as.data.frame(.)%>%
mutate(datasetName = rownames(.))%>%
rename(dataType = V1)
uniqueDataType <- unique(mergedDataWithType$dataType)
# create the merged dataset
mergedDatasets <- list()
#for(i in seq_along(uniqueDataType)){
# Merge all Presence-only datasets together
dataToMerge <- mergedDataWithType[mergedDataWithType$dataType == "PO", ]
mergedDatasets[[1]] <- formattedData[dataToMerge$datasetName]%>%
do.call("rbind", .)%>%
as.data.frame(., row.names = NULL)
rownames(mergedDatasets[[1]]) <- NULL
mergedDatasets[[1]] <- mergedDatasets[[1]] %>%
st_as_sf(., crs = newCrs)%>%
mutate(
simpleScientificName = coalesce(
#redList$species[match(acceptedScientificName, redList$GBIFName)], # Match redList species
str_extract(acceptedScientificName, "^[A-Za-z]+\\s+[a-z]+") # Extract binomial name
)
)%>%
dplyr::filter(simpleScientificName %in% interractionMatrix$species)
names(mergedDatasets)[1] <- paste0("mergedDataset", "PO")
otherData <- mergedDataWithType[!mergedDataWithType$dataType %in% "PO", "datasetName"] %>%
lapply(., function(x){
print(x)
#ds <- which(x %in% mergedDataWithType[!mergedDataWithType$dataType %in% "PO", "datasetName"])
mergeData <- formattedData[x][[1]]%>%
st_as_sf(., crs = newCrs)%>%
mutate(
simpleScientificName = coalesce(
#redList$species[match(acceptedScientificName, redList$GBIFName)], # Match redList species
str_extract(acceptedScientificName, "^[A-Za-z]+\\s+[a-z]+") # Extract binomial name
)
)%>%
dplyr::filter(simpleScientificName %in% interractionMatrix$species)
#names(mergeData) <- x
return(mergeData)
})
names(otherData) <- mergedDataWithType[!mergedDataWithType$dataType %in% "PO", "datasetName"]
mergedDatasets <- c(mergedDatasets,
otherData)
# Kepp the individual Presence-only data
#}
# save the data
saveRDS(mergedDatasets, paste0(dataFolder, "/plantDataFolder/formattedData/formattedPlantData.RDS"))
} else {
plantSpeciesData <- readRDS(paste0(dataFolder, "/plantDataFolder/formattedData/formattedPlantData.RDS"))
} | Dataset name | No. of bees occurrences | No. of butterflies occurrences | No. of hoverflies occurrences |
|---|---|---|---|
| mergedDatasetPO | 22316 | 61009 | 17555 |
| National insect monitoring in Norway | 182 | 157 | 241 |
| Saproxylic insects caught in window traps and hatched from polypores in small and large old forests in southern Norway | 5 | 5 | 0 |
| Solitary bees collected in a large-scale field experiment in power line clearings, southeast Norway | 391 | 0 | 0 |
| Freshwater benthic invertebrates ecological collection NTNU University Museum | 3562 | 3571 | 5439 |
| Bumble bees collected in a large-scale field experiment in power line clearings, southeast Norway | 78 | 0 | 0 |
| Bumblebees and butterflies in Norway | 5996 | 5370 | 0 |
| Freshwater pelagic invertebrates ecological collection NTNU University Museum | 1605 | 0 | 0 |
| Species | mergedDatasetPO | Monitoring data of natural and man-made semi-natural meadows in and around Oslo, Norway 2018-2021 | Vegetation data with and without experimental warming, alpine Finse 2000, 2004, 2011 | Effects of vegetation clearing on vascular plants in power line clearings southeast Norway | Vascular plants in power line clearings and the nearby forest, southeast Norway | Overvåking av semi-naturlig eng (ASO) |
|---|---|---|---|---|---|---|
| Ajuga pyramidalis | 4003 | 22 | 720 | 1533 | 599 | 147 |
| Astragalus alpinus | 1851 | 22 | 720 | 1533 | 599 | 147 |
| Campanula rotundifolia | 11703 | 22 | 720 | 1533 | 599 | 147 |
| Carum carvi | 2584 | 22 | 720 | 1533 | 599 | 147 |
| Filipendula ulmaria | 15967 | 22 | 720 | 1533 | 599 | 147 |
| Galium album | 3624 | 22 | 720 | 1533 | 599 | 147 |
| Galium aparine | 1722 | 22 | 720 | 1533 | 599 | 147 |
| Galium boreale | 7360 | 22 | 720 | 1533 | 599 | 147 |
| Galium elongatum | 1399 | 22 | 720 | 1533 | 599 | 147 |
| Galium palustre | 2180 | 22 | 720 | 1533 | 599 | 147 |
| Galium saxatile | 1299 | 22 | 720 | 1533 | 599 | 147 |
| Galium uliginosum | 1761 | 22 | 720 | 1533 | 599 | 147 |
| Galium verum | 3276 | 22 | 720 | 1533 | 599 | 147 |
| Geranium robertianum | 5497 | 22 | 720 | 1533 | 599 | 147 |
| Geranium sylvaticum | 13861 | 22 | 720 | 1533 | 599 | 147 |
| Gymnadenia conopsea | 3179 | 22 | 720 | 1533 | 599 | 147 |
| Hieracium murorum | 20 | 22 | 720 | 1533 | 599 | 147 |
| Hieracium umbellatum | 3663 | 22 | 720 | 1533 | 599 | 147 |
| Hieracium vulgatum | 26 | 22 | 720 | 1533 | 599 | 147 |
| Hypochaeris radicata | 1287 | 22 | 720 | 1533 | 599 | 147 |
| Knautia arvensis | 6030 | 22 | 720 | 1533 | 599 | 147 |
| Lathyrus linifolius | 6118 | 22 | 720 | 1533 | 599 | 147 |
| Lathyrus pratensis | 5458 | 22 | 720 | 1533 | 599 | 147 |
| Lathyrus vernus | 2013 | 22 | 720 | 1533 | 599 | 147 |
| Leucanthemum vulgare | 9160 | 22 | 720 | 1533 | 599 | 147 |
| Lotus corniculatus | 11190 | 22 | 720 | 1533 | 599 | 147 |
| Nardus stricta | 5966 | 22 | 720 | 1533 | 599 | 147 |
| Origanum vulgare | 2178 | 22 | 720 | 1533 | 599 | 147 |
| Plantago lanceolata | 5091 | 22 | 720 | 1533 | 599 | 147 |
| Plantago major | 6525 | 22 | 720 | 1533 | 599 | 147 |
| Plantago media | 2045 | 22 | 720 | 1533 | 599 | 147 |
| Silene dioica | 6888 | 22 | 720 | 1533 | 599 | 147 |
| Silene vulgaris | 3343 | 22 | 720 | 1533 | 599 | 147 |
| Stellaria graminea | 7784 | 22 | 720 | 1533 | 599 | 147 |
| Stellaria media | 3723 | 22 | 720 | 1533 | 599 | 147 |
| Stellaria nemorum | 3694 | 22 | 720 | 1533 | 599 | 147 |
| Trifolium medium | 4394 | 22 | 720 | 1533 | 599 | 147 |
| Trifolium pratense | 11519 | 22 | 720 | 1533 | 599 | 147 |
| Trifolium repens | 10479 | 22 | 720 | 1533 | 599 | 147 |
| Trollius europaeus | 2856 | 22 | 720 | 1533 | 599 | 147 |
| Valeriana sambucifolia | 106 | 22 | 720 | 1533 | 599 | 147 |
| Veronica alpina | 403 | 22 | 720 | 1533 | 599 | 147 |
| Veronica chamaedrys | 8302 | 22 | 720 | 1533 | 599 | 147 |
| Veronica officinalis | 10738 | 22 | 720 | 1533 | 599 | 147 |
| Veronica serpyllifolia | 711 | 22 | 720 | 1533 | 599 | 147 |
| Vicia cracca | 9257 | 22 | 720 | 1533 | 599 | 147 |
| Vicia sepium | 5705 | 22 | 720 | 1533 | 599 | 147 |
| Vicia sylvatica | 1969 | 22 | 720 | 1533 | 599 | 147 |
| Viola biflora | 2990 | 22 | 720 | 1533 | 599 | 147 |
| Viola canina | 3163 | 22 | 720 | 1533 | 599 | 147 |
| Viola epipsila | 423 | 22 | 720 | 1533 | 599 | 147 |
| Viola palustris | 7853 | 22 | 720 | 1533 | 599 | 147 |
| Viola riviniana | 8650 | 22 | 720 | 1533 | 599 | 147 |
| Viola tricolor | 3667 | 22 | 720 | 1533 | 599 | 147 |
2.1 Spatial and temporal resolution
For both pollinator and plant data obtained on a National scale (entire Norway), we used those observed within \(2010\) to \(2024\). We present the study region in Figure 2.
We describe the indicators at the semi-natural habitats from the ASO Data meadows. The ASO Data locations and their presence/absence records are presented in Figure 10.
2.2 Original units
The original unit of the each dataset obtained from GBIF for the pollinators and plants are provided in Table 6 and Table 7 respectively. The dataset are either presence-only and presence-absence.
2.3 Additional comments about the dataset
3. Indicator properties
3.1. ECT
B1 - Compositional state characteristics
The indicator describes the diversity in biological communities.
3.2. Ecosystem condition characteristic
High indicator values represents meadows with intact plant communities that can facilitate the life cycles of many pollinators.
3.3. Other standards
In the Norwegian typology, yhe indicator is classed as biologisk mangfold.
3.4. Collinearities with other indicators
The indicator may be correlated with other indicators of plant or insect diversity, such as those using data from the Norwegian national insect monitoring.
4. Reference condition and values
4. 1. Reference condition
The reference condition is defined as the maximum ecological potential of pollinators in semi-natural meadows in Norway. This reference condition represents a state where all conditional indicators have a value of \(1\) (Keith et al. 2020). Our reference conditions are based on modeled reference conditions (Hickler et al. 2012; Keith et al. 2020) of ASO meadows. Specifically, the pollinator potential at a given reference condition represents the highest diversity within a geo-climatic region. Although we estimated three alpha diversity indices (see Section 1.9.5), we chose to use the inverse-Simpson index to describe the diversity, since it disproportionately favors common species than the species richness and Shannon-Weiener indices (Jost 2006). An increase in the inverse Simpson index reflects an increase in diversity (Zhou et al. 2002).
An advantage of this model-based reference condition is that we can describe a national indicators for difference landscapes or ecosystems, with reference to the semi-natural habitats or ecosystems, and describe ecosystem variables that can affect the conditions within these landscapes. As a downside, it becomes difficult to scale conditions at levels with indicator values higher than the reference (Keith et al. 2020), especially for regions where alpha diversity is higher than those in the semi-natural grasslands.
4. 2. Reference values
As defined by Keith et al. (2020), we define the value of our ecosystem variable (alpha diversity described in Section 1.9.5) at reference condition and this value will be used to re-scale a variable to derive individual indicators. Putting all pollinators together, we set \(0\) and the maximum diversity value as the endpoints of the range of a condition variable we use in re-scaling. A value of \(0\) shows a poor condition where there is no pollinator diversity and a value of \(1\) indicates a good condition with the highest potential pollinator diversity.
In summary, there are two steps to obtaining these reference values:
- Estimating the maximum values and re-scaling the ecosystem variable
- Defining a threshold value for defining good ecological condition
4.2.1 Maximum values
We divide our semi-natural meadows into geo-climatic regions (that is, they are clustered according to their latitude and annual temperature values). This is a simplification as compared to using a more stable and partly historic vegetation zones (described by feltkurskompendium biologisk mangfold terrestrisk biologi and store norske leksikon). Moreover, this clusters allows us to track species distribution with rising temperatures and the latitude accounts for lag time in establishment [REF].
In each of these clusters, we estimate the maximum alpha diversity index and use this value as reference value for that geo-climatic region. The estimates of the maximum diversity indices within each geo-climatic regions is presented in Table 4.
| Geo climatic region | Latitude | Temperature | Maximum value |
|---|---|---|---|
| 1 | Positive | Positive | 7714.728 |
| 2 | Negative | Positive | 7552.012 |
| 4 | Negative | Negative | 6021.358 |
Using the reference values presented in Table 4, we use a linear transformation to calculate the ecosystem condition indicators (Keith et al. 2020): \[ Indicator = \frac{D - V_L}{V_H - V_L}; \] where \(D\) is the alpha diversity estimate, \(V_L\) is the lowest reference level (which we have set to \(0\)) and \(V_H\) is the maximum reference value. We present these indicator values for the ASO meadows and the national scale in Figure 3. Note that the indicators on the national scale needs to be interpreted with reference to the ASO meadows.
Code
speciesForTaxon <- biotraits %>%
dplyr::select(acceptedScientificName, Taxon, zone)%>%
group_by(Taxon, zone)%>%
distinct()%>%
ungroup()%>%
mutate(
simpleScientificName = coalesce(
#redList$species[match(acceptedScientificName, redList$GBIFName)], # Match redList species
str_extract(acceptedScientificName, "^[A-Za-z]+\\s+[a-z]+") # Extract binomial name
))%>%
na.omit()
# Load all the plant predictions result
if(!exists("interractionMatrix")) source("pipeline/webForInterractions/webPlotsForModels.R")
asoDatasf <- readRDS(file = paste0(dataFolder, "/plantDataFolder/formattedData/asoData.RDS"))%>%
mutate(
simpleScientificName = coalesce(
#redList$species[match(acceptedScientificName, redList$GBIFName)], # Match redList species
str_extract(acceptedScientificName, "^[A-Za-z]+\\s+[a-z]+") # Extract binomial name
)) %>%
dplyr::filter(simpleScientificName %in% c(interractionMatrix$species)) #
plantSpecies <- unique(asoDatasf$simpleScientificName)
plantSpecies <- sapply(plantSpecies, function(x){
rt <- stringr::str_split(x, " ")[[1]]
taxon <- paste(rt[1], rt[2], sep = "_")
return(taxon)
}) %>%
c()
plantSpecies <- as.character(plantSpecies)
##################################################
# Plant Species results
#########################################
speciesData <- lapply(paste0(dataFolder, "/modelOutputs/plants/", plantSpecies), function(x){
# try(list.files(x, pattern = paste0("predictions.rds"), recursive = TRUE, full.names = TRUE))
preds <- readRDS(list.files(x, pattern = paste0("predictions.rds"), recursive = TRUE, full.names = TRUE))
return(preds$Probabilities)
})|>
rast() |> # combine raster layers
#scale() |> # scale raster layers
setNames(plantSpecies)
if(rerun){
######################################
# ZONAL DIVERSITY ESTIMATES
#####################################
zone <- c("alpine", "boreal", "temperate")
# Make predictions for all the zones
sp <- "allTaxa"
spPred <- estimateTaxonDiversity(pollinatorSpeciesNames = NULL,
speciesForTaxon = speciesForTaxon,
taxa = sp,
predRast = predRast,
silent = FALSE)
path <- paste0(dataFolder, "/modelOutputs/diversity/", sp)
if(!file.exists(path)){
dir.create(path)
}
saveRDS(spPred, file.path(path, paste0("predictions.rds")))
} else {
# load(paste0(dataFolder, "/modelOutputs/diversity/alpineDiversity.RData"))
# load(paste0(dataFolder, "/modelOutputs/diversity/borealDiversity.RData"))
alpine <- readRDS("Data/modelOutputs/diversity/alpine/predictions.rds")
boreal <- readRDS("Data/modelOutputs/diversity/boreal/predictions.rds")
temperate <- readRDS("Data/modelOutputs/diversity/temperate/predictions.rds")
bees <- readRDS("Data/modelOutputs/diversity/Bees/predictions.rds")
butterflies <- readRDS("Data/modelOutputs/diversity/Butterflies/predictions.rds")
hoverflies <- readRDS("Data/modelOutputs/diversity/Hoverflies/predictions.rds")
}
#############################
# Diverisity at ASOData locations
#############################
source("pipeline/webForInterractions/webPlotsForModels.R")
asoDatasf <- readRDS(file = paste0(dataFolder, "/plantDataFolder/formattedData/asoData.RDS"))%>%
mutate(
simpleScientificName = coalesce(
str_extract(acceptedScientificName, "^[A-Za-z]+\\s+[a-z]+") # Extract binomial name
)) %>%
dplyr::filter(simpleScientificName %in% c(interractionMatrix$species))%>%
st_transform(newCrs)# Select the species in the interraction matrix
source("pipeline/dataImport/environmentalDataImport.R")
names(environmentalCovariates)[14] <- "bio.sq"
names(environmentalCovariates)[16] <- "bm.Intsq"
# Reclassify the levels of the environmental covariates
levels(environmentalCovariates$landCover)[[1]][,2][c(18, 19, 23, 24, 25, 26)] <- "Other forest"
levels(environmentalCovariates$landCover)[[1]][,2][29:37] <- "Others"
# Reclassify the levels of environmentalCovariates randomised
levels(environmentalCovariatesRandomized$landCover)[[1]][,2][c(18, 19, 23, 24, 25, 26)] <- "Other forest"
levels(environmentalCovariatesRandomized$landCover)[[1]][,2][29:37] <- "Others"
# Estract the covariates and diversity values at these locations
asoDataDiv <- terra::extract(allRast,
unique(vect(st_geometry(asoDatasf))))
asoDataCovs <- terra::extract(environmentalCovariates,
unique(vect(st_geometry(asoDatasf))))
asoDataMeadows <- cbind(asoDataDiv,
asoDataCovs,
crds(unique(vect(st_geometry(asoDatasf)))))%>%
st_as_sf(., coords = c("x", "y"), crs = newCrs)
asoDataMeadows <- asoDataMeadows[complete.cases(asoDataMeadows%>% st_drop_geometry()), ]%>%
rowwise()%>%
mutate(sumShannon = sum(c_across(c(3, 6, 9)))) %>%
mutate(cluster = ifelse(Latitude > 0 & bio1 > 0, 1,
ifelse(Latitude < 0 & bio1 > 0, 2,
ifelse(Latitude > 0 & bio1 < 0, 3, 4))))%>%
dplyr::group_by(cluster)%>%
dplyr::mutate(alpineReference = alpine_simpson/ max(allZones_simpson),
borealReference = boreal_simpson/ max(allZones_simpson),
temperateReference = temperate_simpson/ max(allZones_simpson),
allReference = allZones_simpson / max(allZones_simpson),
maxReference = max(allZones_simpson),
minReference = min(allZones_simpson))%>%
ungroup()
#save the asoData meadows for plot later
save(asoDataMeadows,
file = paste0(dataFolder, "/markdownResults/image/asoDataMeadows.RData"))4.2.2. Threshold value for defining good ecological condition (if relevant)
Given the indicator values estimated in Section 1.4.2.1, we define threshold values for defining a good ecosystem condition. These choose three threshold values and assess the impacts of the choice of threshold on the ecosystem condition:
Threshold of 0.5
Threshold of 0.6 (used by Framstad et al. (2022) in defining condition of forest and mountain ecosystem in Norway)
Threshold of 0.75
Any location or grid cell or meadow with its indicator greater than this threshold is deemed to be in a good condition, otherwise in a poor condition. Figure 4 presents the conditions of pollinators in ASO meadows.
Code
interestedVars <- c("alpine_simpson", "boreal_simpson", "temperate_simpson", "allZones_simpson")
selectedVars <- lapply(interestedVars, function(x){
allRast[[x]]
}) %>%
rast()%>%
as.data.frame(na.rm = TRUE, xy = TRUE)%>%
st_as_sf(., coords = c("x", "y"), crs = newCrs)%>%
cbind(., terra::extract(environmentalCovariates,
unique(vect(st_geometry(.)))))%>%
dplyr::mutate(allReference = ifelse(Latitude > 0 & bio1 > 0, (allZones_simpson/7714.728),
ifelse(Latitude < 0 & bio1 > 0, (allZones_simpson/7552.012),
ifelse(Latitude > 0 & bio1 < 0, (allZones_simpson/7714.728),
(allZones_simpson/6021.358)))))%>%
dplyr::mutate(condition75 = ifelse(allReference > 0.75, "Good", "Poor"),
condition60 = ifelse(allReference > 0.60, "Good", "Poor"),
condition50 = ifelse(allReference > 0.5, "Good", "Poor"))%>%
select(allReference, condition50, condition60, condition75)
save(selectedVars,
file = paste0(dataFolder, "/markdownResults/image/ecosystemConditions.RData"))What does this mean and how do we interpret these thresholds. Does these conditions make sense and how do we interpret those?
4.2.3. Spatial resolution and validity
The reference values are estimated on \(500\) meter resolution and it is fixed across all areas. We also present the pollinator condition on the national scale in Figure 5.
5. Uncertainties
Instead of resampling our data to define the uncertainties (Framstad et al. 2022), we use the 95% from the pollinator intensity and plant species occurrence probability.
This can be done, but I need some time to do this.
6. References
7. Datasets
Here, we describe each of the dataset used for the modelling. We refer the readers to the dataset description provided on the GBIF website.
| Dataset | Link to description |
|---|---|
| Bumble bees collected in a large-scale field experiment in power line clearings, southeast Norway | https://www.gbif.org/dataset/19fe96b0-0cf3-4a2e-90a5-7c1c19ac94ee |
| Bumblebees and butterflies in Norway | https://www.gbif.org/dataset/19fe96b0-0cf3-4a2e-90a5-7c1c19ac94ee |
| Ecofact | https://www.gbif.org/dataset/19fe96b0-0cf3-4a2e-90a5-7c1c19ac94ee |
| Entomological collections, UiB | [link](https://www.gbif.org/dataset/19fe96b0-0cf3-4a2e-90a5-7c1c19ac94ee) |
| Entomology collection, UiT Tromsø Museum | NA |
| Entomology, Oslo (O) UiO | NA |
| Freshwater benthic invertebrates ecological collection NTNU University Museum | NA |
| Freshwater pelagic invertebrates ecological collection NTNU University Museum | NA |
| Insect collection, UiT, University Museum (TSZ). Insect labeling project and PhD-duty-work | NA |
| Insects of the Forest-Tundra Ecotone (ForTunE) | NA |
| Invertebrate collections, UiB | NA |
| Jordal | NA |
| Lepidoptera collection, South Norway | NA |
| Mapping and monitoring of the Glanville fritillary Melitaea cinxia | NA |
| NINA insect database | NA |
| NORSC - Sciaroidea, UiT Tromsø Museum | NA |
| National insect monitoring in Norway | https://www.gbif.org/dataset/19fe96b0-0cf3-4a2e-90a5-7c1c19ac94ee |
| Norwegian Biodiversity Information Centre - Other datasets | NA |
| Norwegian Species Observation Service | NA |
| Observation.org, Nature data from around the World | NA |
| Occurrence data from various smaller projects in Norway | NA |
| Saproxylic insects caught in window traps and hatched from polypores in small and large old forests in southern Norway | https://www.gbif.org/dataset/19fe96b0-0cf3-4a2e-90a5-7c1c19ac94ee |
| Solitary bees collected in a large-scale field experiment in power line clearings, southeast Norway | NA |
| Terrestrial and limnic invertebrates systematic collection, NTNU University Museum | NA |
| iNaturalist Research-grade Observations | NA |
| naturgucker | NA |
Here, I intend to link all the datasets to their correct version on the GBIF repository.
7.1 Pollinator datasets
| Dataset name | Data type | No. of observations | Percent |
|---|---|---|---|
| Bumble bees collected in a large-scale field experiment in power line clearings, southeast Norway | SAMPLING_EVENT | 5016 | 0.28 |
| Bumblebees and butterflies in Norway | SAMPLING_EVENT | 31112 | 1.75 |
| Ecofact | OCCURRENCE | 2 | 0.00 |
| Entomological collections, UiB | OCCURRENCE | 2757 | 0.16 |
| Entomology collection, UiT Tromsø Museum | OCCURRENCE | 2 | 0.00 |
| Entomology, Oslo (O) UiO | OCCURRENCE | 63124 | 3.56 |
| Freshwater benthic invertebrates ecological collection NTNU University Museum | SAMPLING_EVENT | 11559 | 0.65 |
| Freshwater pelagic invertebrates ecological collection NTNU University Museum | SAMPLING_EVENT | 17 | 0.00 |
| Insect collection, UiT, University Museum (TSZ). Insect labeling project and PhD-duty-work | OCCURRENCE | 2048 | 0.12 |
| Insects of the Forest-Tundra Ecotone (ForTunE) | OCCURRENCE | 216 | 0.01 |
| Invertebrate collections, UiB | OCCURRENCE | 14 | 0.00 |
| Jordal | OCCURRENCE | 48 | 0.00 |
| Lepidoptera collection, South Norway | OCCURRENCE | 613 | 0.03 |
| Mapping and monitoring of the Glanville fritillary Melitaea cinxia | OCCURRENCE | 785 | 0.04 |
| NINA insect database | OCCURRENCE | 21342 | 1.20 |
| NORSC - Sciaroidea, UiT Tromsø Museum | OCCURRENCE | 6886 | 0.39 |
| National insect monitoring in Norway | insectMonitoring | 610146 | 34.41 |
| Norwegian Biodiversity Information Centre - Other datasets | OCCURRENCE | 21 | 0.00 |
| Norwegian Species Observation Service | OCCURRENCE | 971031 | 54.75 |
| Observation.org, Nature data from around the World | OCCURRENCE | 4636 | 0.26 |
| Occurrence data from various smaller projects in Norway | OCCURRENCE | 7 | 0.00 |
| Saproxylic insects caught in window traps and hatched from polypores in small and large old forests in southern Norway | SAMPLING_EVENT | 122 | 0.01 |
| Solitary bees collected in a large-scale field experiment in power line clearings, southeast Norway | SAMPLING_EVENT | 2699 | 0.15 |
| Terrestrial and limnic invertebrates systematic collection, NTNU University Museum | OCCURRENCE | 32654 | 1.84 |
| iNaturalist Research-grade Observations | OCCURRENCE | 6456 | 0.36 |
| naturgucker | OCCURRENCE | 99 | 0.01 |
Out of the 26 datasets obtained from GBIF, over half of the total occurrence reports for the pollinators were from the Norwegian species Observation Service, a third of the total occurrence report were from the National insect monitoring in Norway, and the other 24 datasets cover 22 percent of the total pollinator occurrences.
After processing the data, it is important that we check that there are only information on bees from the following datasets: Solitary bees collected in a large-scale field experiment in power line clearings (southeast Norway), Bumble bees collected in a large-scale field experiment in power line clearings (southeast Norway), as can be seen in Figure 6 and Figure 7.
We also check that we have information on only bees and butterflies from the Bumblebees and butterflies in Norway dataset, displayed in Figure 8.
7.2. Plant datasets
From our query for datasets from GBIF, we got a total of 30 datasets which we used to fit the ISDM for the plants. Out of these occurrence records, about 86 percent of these records were from the Norwegian species Observation Service.
| Dataset name | Data type | No. of observations | Percent |
|---|---|---|---|
| ARKO strandeng | OCCURRENCE | 274 | 0.09 |
| Ecofact | OCCURRENCE | 51 | 0.02 |
| Effects of vegetation clearing on vascular plants in power line clearings southeast Norway | SAMPLING_EVENT | 779 | 0.24 |
| Jordal | OCCURRENCE | 5434 | 1.70 |
| Monitoring data of natural and man-made semi-natural meadows in and around Oslo, Norway 2018-2021 | SAMPLING_EVENT | 2310 | 0.72 |
| Norwegian Species Observation Service | OCCURRENCE | 288837 | 90.51 |
| Observation.org, Nature data from around the World | OCCURRENCE | 2993 | 0.94 |
| Occurrence data from various smaller projects in Norway | OCCURRENCE | 775 | 0.24 |
| Overvåking av semi-naturlig eng (ASO) | SAMPLING_EVENT | 1606 | 0.50 |
| Overvåking av åpen grunnlendt kalkmark i Oslofjordområdet | OCCURRENCE | 1781 | 0.56 |
| Pl@ntNet automatically identified occurrences | OCCURRENCE | 3265 | 1.02 |
| Pl@ntNet observations | OCCURRENCE | 443 | 0.14 |
| Stabbetorp - Floristiske registreringer 2016 | OCCURRENCE | 1209 | 0.38 |
| Vascular Plant Herbarium, Oslo (O) UiO | OCCURRENCE | 1006 | 0.32 |
| Vascular Plant Herbarium, UiB | OCCURRENCE | 45 | 0.01 |
| Vascular Plants, Observations, Oslo (O) | OCCURRENCE | 192 | 0.06 |
| Vascular plant herbarium (KMN) UiA | OCCURRENCE | 299 | 0.09 |
| Vascular plant herbarium TRH, NTNU University Museum | OCCURRENCE | 659 | 0.21 |
| Vascular plant herbarium, The Arctic University Museum of Norway (TROM) | OCCURRENCE | 208 | 0.07 |
| Vascular plants in power line clearings and the nearby forest, southeast Norway | SAMPLING_EVENT | 269 | 0.08 |
| Vegetation data with and without experimental warming, alpine Finse 2000, 2004, 2011 | SAMPLING_EVENT | 148 | 0.05 |
| iNaturalist Research-grade Observations | OCCURRENCE | 6509 | 2.04 |
| naturgucker | OCCURRENCE | 28 | 0.01 |
7.3. Covariates
Code
#########################################################################
## Load bioclimatic variables
#########################################################################
#BIO1 = Annual Mean Temperature
if(!file.exists(paste0(dataFolder, "/Rasters/environmentalCovariates.tiff"))){
# convert crs to format accepted by sf, terra, and intSDM (& dependencies)
projCRS <- newCrs #sf::st_crs(crs)$proj4string
#download the data
#bioclim <- geodata::worldclim_country("NO", res = 10, var = "bio", path = paste0(dataFolder, "/Rasters"))
# Import the climatic variables
BioClimNorway <- rast(paste0(dataFolder, "/Rasters/wc2.1_country/NOR_wc2.1_30s_bio.tif"))%>%
terra::project(., projCRS)
# Create a 1km grid cell
OnekmCellID <- BioClimNorway[[1]]
names(OnekmCellID) <- "CellID"
OnekmCellID[] <- 1:ncell(OnekmCellID)
OnekmCellID <- mask(OnekmCellID,BioClimNorway[[1]])
# Extract the latitude covariate
LatitudeRast <- BioClimNorway[[1]]
LatitudeRast[] <- crds(LatitudeRast, na.rm = FALSE)[,2]
LatitudeRast <- mask(LatitudeRast,BioClimNorway[[1]])
names(LatitudeRast) <- "Latitude"
#Extract the longitude covariate
LongitudeRast <- BioClimNorway[[1]]
LongitudeRast[] <- crds(LongitudeRast, na.rm = FALSE)[,1]
LongitudeRast <- mask(LongitudeRast,BioClimNorway[[1]])
names(LongitudeRast) <- "Longitude"
# Put them together
BioClimsForSpeciesRegions <- c(BioClimNorway[[1]],
OnekmCellID,
LatitudeRast,
LongitudeRast)
names(BioClimsForSpeciesRegions)[1] <- "bio1"
# Extract the points within the spatial points
#source("pipeline/dataImport/importPollinatorDataset.R")
speciesDataImported <- readRDS(paste0(dataFolder, "/pollinatorDataFolder/speciesDataImported.RDS"))
PointsBeesAndButterfliesAndHoverfliesUsed <- speciesDataImported[["species"]]%>%
do.call("rbind", .)%>%
sf::st_as_sf(., coords = c('X', 'Y'), crs = newCrs)%>%
dplyr::select(acceptedScientificName, Taxon)
rm("speciesDataImported")
gc()
#PointsBeesAndButterfliesAndHoverfliesUsed <- PointsBeesAndButterfliesAndHoverflies$mergedDatasetPO%>%
# st_transform(newCrs)%>%
# vect()
#Extract values for the Bioclims and PointsBeesAndButterfliesAndHoverflies
climaticVarsExtracted <- terra::extract(BioClimsForSpeciesRegions,
PointsBeesAndButterfliesAndHoverfliesUsed)%>%
cbind(., st_coordinates(st_as_sf(PointsBeesAndButterfliesAndHoverfliesUsed)))%>%
st_as_sf(., coords = c('X', 'Y'), crs = projCRS)%>%
st_join(st_as_sf(PointsBeesAndButterfliesAndHoverfliesUsed), .,join = st_nearest_feature, left = TRUE)
# Calculate the mean and SD of the bio1
Bio1Traits <- aggregate(bio1~acceptedScientificName+Taxon,
climaticVarsExtracted,
mean)
names(Bio1Traits)[3] <- "mBio1"
Bio1Traits$sdBio1 <- aggregate(bio1~acceptedScientificName+Taxon,
climaticVarsExtracted,
sd)[,3]
vals <- Bio1Traits$sdBio1
vals[is.na(vals)] <- 0
Bio1Traits$sdBio1 <- vals
Bio1Traits$mLat <- aggregate(Latitude ~ acceptedScientificName+Taxon,
climaticVarsExtracted,
mean)[,3]
Bio1Traits$sdLat <- aggregate(Latitude ~ acceptedScientificName+Taxon,
climaticVarsExtracted,
sd)[,3]
vals <- Bio1Traits$sdLat
vals[is.na(vals)] <- 0
Bio1Traits$sdLat <- vals
# Community estimates for each 1 Km grid
Bio1Traits$RegCom <-aggregate(CellID~acceptedScientificName+Taxon,
climaticVarsExtracted,
function(x)length(unique(x)))[,3]
roundFunction <- function(x, digits) round(x, 1)
Bio1Traits <- Bio1Traits %>%
mutate_at(c("mBio1", "sdBio1", "mLat", "sdLat", "RegCom"), scale)#%>%
#mutate_at(c("mBio1", "sdBio1", "mLat", "sdLat", "RegCom"), roundFunction)
write.csv(Bio1Traits, file = paste0(dataFolder, "/bioTraits.csv"))
# We randomize the traits for use in cross validation.
# We create some background locations for this purpose
BackgroundBioClim <- BioClimsForSpeciesRegions[]
BackgroundBioClim <- data.frame(BackgroundBioClim[complete.cases(BackgroundBioClim),])
BackgroundBioClim$Lat <- BackgroundBioClim$Latitude
#BackgroundBioClim$Occurrence <- 0
#head(BackgroundBioClim)
RandomSampleBioClimTraits <- Bio1Traits[sample(1:nrow(Bio1Traits[,3:7]),
nrow(BackgroundBioClim),
replace = TRUE),
3:7]
# Merge these estimates with the climatic variables extracted
climaticVarsExtractedNew <- merge(climaticVarsExtracted,
Bio1Traits ,
by = c("acceptedScientificName", "Taxon"))%>%
na.omit()%>%
select(-c(acceptedScientificName, Taxon, ID, Longitude))
# Climatic variables with randomised covariates
climaticVarsExtractedRandomized <- cbind(BackgroundBioClim,
RandomSampleBioClimTraits)%>%
na.omit()%>%
as.data.frame() %>%
st_as_sf(., coords = c('Longitude', 'Lat'), crs = projCRS)#%>%
#rbind(., climaticVarsExtracted)
# COnvert the sf object to a raster
# climaticVarsExtracted <- climaticVarsExtracted%>%
# st_as_sf()%>%
# rast()
# define region to download as bounding box of buffered and projected mesh/regionGeometry
regionGeometryBuffer <-st_union(if(exists("meshForProject")) {
meshForProject |>
inlaMeshToSf()
}
else regionGeometry) |>
st_buffer(200) |>
st_transform(projCRS) |>
st_bbox() |>
st_as_sfc() |>
st_segmentize(dfMaxLength = 100) |>
vect()
# define working project raster
baseRaster <- terra::rast(extent = ext(regionGeometryBuffer),
res = c(1, 1),
crs = projCRS)
# rasterise regionGeometry
regionGeometryRast <- regionGeometry |>
st_as_sf() |>
st_transform(projCRS) |>
vect() |>
terra::rasterize(baseRaster, FUN = "mode")
#Write a funtion to perform the rasterisation
# rasterFunction <- function (x) terra::rasterize(climaticVarsExtracted,
# regionGeometryRast,
# x,
# FUN = "sum",
# background = NA)
myMax <- function(x){
max(x, na.rm = TRUE)
}
# Creating the environmental covariates we need to fit the models
# Trying to create the them as the same resolution with the bioclim raster
# climaticVarsExtractedWithoutBioclimVars <- climaticVarsExtractedNew
#climaticVarsExtractedWithoutBioclimVarsRandomized <- climaticVarsExtractedRandomized#[-c(1:3)]
# Create a function to rasterise the covariates
rasterFunction <- function (i, sfObject){
terra::rasterize(x = st_coordinates(sfObject),
#y = regionGeometryRast,
y = BioClimNorway,
values = c( sfObject[[i]]),
FUN = "mean")#,
#update = TRUE, cover = TRUE)
}
# Create a function to rasterise the covariates by filling the missing values
# mbio1 and mLats with its sd counterparts have missing values for the raster
# created for the model in pointedSDMs.
# To deal with this, we use the inlabru's missing value function to fill the
# missing values
rasterFunctionWithMissingValues <- function(oldRaster){
# create an sf object with the coordinates of the bioclim vars
sfObject <- st_as_sf(as.data.frame(crds(BioClimsForSpeciesRegions)),
coords = c("x", "y"),
crs = projCRS
)
# use the coordinates and the oldRaster with missing values
# to get the new raster
fillingMissingValues <- covariateMissingValues(sfObject, 1, "last", oldRaster)
# Create a rester with the new values
terra::rasterize(x = st_coordinates(sfObject),
y = BioClimsForSpeciesRegions,
values = c(fillingMissingValues),
FUN = "mean")
}
# library(tidyverse)
# library(sf)
# library(stars)
#rt <- st_rasterize(climaticVarsExtracted %>% dplyr::select(Latitude, geometry))
# Create rasters for environmental covariates
foo_sf <- list()
for(i in 1:(length(climaticVarsExtractedNew)-1)){
x <- names(climaticVarsExtractedNew)[i]
oldRaster <- rasterFunction(x, climaticVarsExtractedNew)
foo_sf[[i]] <- rasterFunctionWithMissingValues(oldRaster)
}
# Create rasters for randomised covariates
foo_sf_random <- list()
for(i in 1:(length(climaticVarsExtractedRandomized)-1)){
x <- names(climaticVarsExtractedRandomized)[i]
oldRaster <- rasterFunction(x, climaticVarsExtractedRandomized)
foo_sf_random[[i]] <- rasterFunctionWithMissingValues(oldRaster)
}
# Add the other BioClim variables to the new raster
fooBioclim_sf <- fooBioclim_sf_randomized <- list()
namesOfBioClimVars <- c("bio1", "CellID", "Latitude")
for(i in 1:length(namesOfBioClimVars)){
fooBioclim_sf[[i]] <- BioClimsForSpeciesRegions[[i]]
fooBioclim_sf_randomized[[i]] <- BioClimsForSpeciesRegions[[i]]
}
# Now we load land corine cover and habitat heterogeneity
if(!file.exists(paste0(dataFolder, "/Rasters/landCoverClassification.tiff"))){
# Note that the function corineReclassification will require you
# to open the ZIP file with the land corine dataset.
# For this work, the file is located at the Data/50599
landCover <- corineReclassification()%>%
terra::project(., projCRS)
# For some reason, NaNs are added as factors in the model,
# so I set that as water bodies since
levels(landCover)[[1]][,2][is.na(levels(landCover)[[1]][,2])] <- "Water bodies"
levels(landCover)[[1]][,2][28] <- "Moors and heathland"
#values(landCover)[,1][is.nan(values(landCover)[,1])] <- 48
# save the land cover covariate
# It takes a long time to finish the projection
writeRaster(landCover,
paste0(dataFolder, "/Rasters/landCoverClassification.tiff"),
overwrite=TRUE)
} else {
# Load the already formatted copy
landCover <- rast(paste0(dataFolder, "/Rasters/landCoverClassification.tiff"))
}
soilPh <- rast(paste0(dataFolder, "/Rasters/soil_ph.tiff"))%>%
terra::project(., projCRS)
soilCoarseFraction <- rast(paste0(dataFolder, "/Rasters/soil_coarse_fraction.tiff"))%>%
terra::project(., projCRS)
soilMoisture <- rast(paste0(dataFolder, "/Rasters/soil_moisture.tiff"))%>%
terra::project(., projCRS)
soilOrganicCarbon <- rast(paste0(dataFolder, "/Rasters/soil_organic_carbon.tiff"))%>%
terra::project(., projCRS)
# Put all the lists together
allFoo_sf <- c(fooBioclim_sf,
foo_sf[-c(1:3)],
landCover,
soilPh,
soilCoarseFraction,
soilOrganicCarbon,
soilMoisture)
allFoo_sf_random <- c(fooBioclim_sf_randomized,
foo_sf_random[-c(1:3)],
landCover,
soilPh,
soilCoarseFraction,
soilOrganicCarbon,
soilMoisture)
# set the names to use for the raster
namesToUse <- c(#namesOfBioClimVars,
names(climaticVarsExtractedNew)[-length(names(climaticVarsExtractedNew))],
"landCover",
"soilPh",
"soilCoarseFraction",
"soilOrganicCarbon",
"soilMoisture")
# Put all the parameters together
#z <- 0
parametersCropped <- allFoo_sf |>
lapply(function(x) {
#print(z +1)
#print(is.raster(x))
regionExt <- as.polygons(terra::project(regionGeometryBuffer, x), extent = TRUE)
# Crop each covariate to extent of regionGeometryBuffer
out <- terra::crop(x, regionExt, snap = "near")
# Project all rasters to baseRaster and combine
if(is.factor(x)) {
if("Water bodies" %in% levels(out)[[1]][,2]){
values(out)[,1][is.nan(values(out)[,1])] <- 48
levels(out) <- levels(x)
}
# project categorical rasters
out <- terra::project(out, baseRaster, method = "mode")
levels(out) <- levels(x) # reassign levels
out
} else {
# project & scale continuous rasters
ifel(is.na(regionGeometryRast), NA,
terra::project(out, baseRaster)) |>
scale()
}
}) |>
rast() |> # combine raster layers
#scale() |> # scale raster layers
setNames(namesToUse)
parametersCroppedRandomized <- allFoo_sf_random |>
lapply(function(x) {
#print(z +1)
#print(is.raster(x))
regionExt <- as.polygons(terra::project(regionGeometryBuffer, x), extent = TRUE)
# Crop each covariate to extent of regionGeometryBuffer
out <- terra::crop(x, regionExt, snap = "near")
# Project all rasters to baseRaster and combine
if(is.factor(x)) {
# I make sure that NaNs are not part of the land Cover values
if("Water bodies" %in% levels(out)[[1]][,2]){
values(out)[,1][is.nan(values(out)[,1])] <- 48
levels(out) <- levels(x)
}
# project categorical rasters
out <- terra::project(out, baseRaster, method = "mode")
levels(out) <- levels(x) # reassign levels
out
} else {
# project & scale continuous rasters
ifel(is.na(regionGeometryRast), NA,
terra::project(out, baseRaster)) |>
scale()
}
}) |>
rast() |> # combine raster layers
#scale() |> # scale raster layers
setNames(namesToUse)
#plot(parametersCropped)
#plot(parametersCroppedRandomized)
# Define the covariates needed
#r <- disagg(parametersCropped, fact = 30, method = "bilinear")
#parameters <- parametersCropped
# The prediction data is in a bounded box, and for landCover, we have values within
# the entire bounded box. We need to mask the covariates by the regionGeometry
parameters <- parametersCropped
parameters <- regionGeometry %>%
st_transform(., newCrs)%>%
vect( )%>%
mask(parameters, .)
# define geometries to combine with prediction
geometries <- xyFromCell(parameters , seq(ncell(parameters))) %>%
as.data.frame() %>%
st_as_sf(coords = c("x", "y"), crs = newCrs)
pars <- parameters %>%
as.data.frame(na.rm = FALSE)%>%
select(bio1)%>%
bind_cols(geometries)%>% # transform the prediction data to the units used in model runs
filter(rowSums(is.na(.)) != (ncol(.)-1 ))
PolyBio1 <- poly(pars$bio1, 2)[,1]
PolyBio2 <- poly(pars$bio1, 2)[,2]
pars <- pars %>%
dplyr::mutate(polyBio1 = PolyBio1,
polyBio2 = PolyBio2) %>%
st_sf()%>%
st_transform(., newCrs)
spPred <- rasterize(pars, baseRaster, names(pars)[-length(pars)])
#parametersCropped
parameters$polyBio1 <- spPred$polyBio1
parameters$polyBio2 <- spPred$polyBio2
parameters$polyBio1Mbio <- spPred$polyBio1 * parameters$mBio1
parameters$polyBio2Mbio <- spPred$polyBio2 * parameters$mBio1
parameters$bio1sq <- (parameters$bio1)^2
parameters$bmInt <- parameters$bio1 * (parameters$mBio1)
parameters$bmIntsq <- (parameters$bio1)^2 * parameters$mBio1
parameters$sdmInt <-parameters$sdBio1 * parameters$mBio1
parameters$mlatInt <- parameters$Latitude * parameters$mLat
environmentalCovariates <- parameters
# save the covariates
writeRaster(environmentalCovariates,
paste0(dataFolder, "/Rasters/environmentalCovariates.tiff"),
overwrite=TRUE)
# Calculate the quadratic for the randomized covariates
parameters <- parametersCroppedRandomized
#parametersCropped
parameters$polyBio1 <- spPred$polyBio1
parameters$polyBio2 <- spPred$polyBio2
parameters$polyBio1Mbio <- spPred$polyBio1 * parameters$mBio1
parameters$polyBio2Mbio <- spPred$polyBio2 * parameters$mBio1
parameters$bio1sq <- (parameters$bio1)^2
parameters$bmInt <- parameters$bio1 * (parameters$mBio1)
parameters$bmIntsq <- (parameters$bio1)^2 * parameters$mBio1
parameters$sdmInt <- parameters$sdBio1 * parameters$mBio1
parameters$mlatInt <- parameters$Latitude * parameters$mLat
environmentalCovariatesRandomized <- parameters
# save the covariates
writeRaster(environmentalCovariatesRandomized,
paste0(dataFolder, "/Rasters/environmentalCovariatesRandomized.tiff"),
overwrite=TRUE)
} else {
environmentalCovariates <- rast(paste0(dataFolder, "/Rasters/environmentalCovariates.tiff"))
environmentalCovariatesRandomized <- rast(paste0(dataFolder, "/Rasters/environmentalCovariatesRandomized.tiff"))
}We used several covariates to fit both the pollinator and plant ISDMs. The names of the covariates, their description and source are presented in Table 8. Additionally, we include the indication of whether the covariate is a habitat, trait or climatic variables as well as which of the ISDMs it was used to fit in Table 8. The rasters of the covariates are on \(1\) km resolution. For the annual temperature, we create a raster of the orthogonal polynomial of degree \(2\) (using the poly function in the stats R-package) and use them to fit the pollinator distribution instead of the original annual temperature values. The soil covariates were obtained from Poggio et al. (2021).
| Name | Description | Source | Habitat/ Trait/ Climatic | Disribution |
|---|---|---|---|---|
| bio1 | Annual mean temperature | geodata R-package | Climatic | Plant |
| polyBio1 | First column from orthogonal polynomial of annual mean temperature to degree of 2 | Calculated from bio1 raster | Climatic | Pollinator |
| polyBio2 | Second column from orthogonal polynomial of annual mean temperature to degree of 2 | Calculated from bio1 raster | Climatic | Pollinator |
| Latitude | Latitudinal gradient extracted from the bio1 raster | Extracted from bio1 raster | Trait | Pollinator |
| mBio1 | Mean pollinator annual temperature per each pollinator species | Calculated from bio1 raster and data downloaded from GBIF | Trait | Pollinator |
| sdBio1 | Standard deviation of pollinator annual temperature per each pollinator species | Calculated from bio1 raster and data downloaded from GBIF | Trait | Pollinator |
| mLat | Mean pollinator latitude per each pollinator species | Calculated from latitude raster and data downloaded from GBIF | Trait | Pollinator |
| sdmInt | Interraction between sdBio1 and mBio1 | NA | NA | Pollinator |
| poly1bmInt | Interraction between mBio1 and polyBio1 | NA | NA | Pollinator |
| poly2bmInt | Interraction between mBio1 and polyBio2 | NA | NA | Pollinator |
| landCover | Land cover raster | Land corine data | Habitat | Pollinator, Plant |
| soilPh | Ph of soil | Poggio et al. (2021) | Habitat | Plant |
| soilCoarseFraction | Soil coarse fraction | Poggio et al. (2021) | Habitat | Plant |
| soilOrganicCarbon | Soil orgainic carbon | Poggio et al. (2021) | Habitat | Plant |
| soilMoisture | Soil moisture | Poggio et al. (2021) | Habitat | Plant |
We present the distribution of the covariates described in Table 8 here in Figure 11.
To ascertain the trait effect on the pollinator distribution, we randomise the triat covariates in Table 8. These randomised covariates (displayed in Figure 12) are used to fit ISDMs, and compared with the ISDMs fitted with covariates in Figure 11 .
8. Spatial units
We describe the spatial units of our analysis based on the UN System of Environmental Accounting (SEEA EA; see Rahayu et al. (2024) and structure of ecosystem accounting). The Norwegian territory defines our ecosystem accounting area (EEA) and the vegetation zones (described by feltkurskompendium biologisk mangfold terrestrisk biologi and store norske leksikon) define the ecosystem types and our ecosystem assets (EA) are defined by the geo-climatic regions (described in Section 1.4.1). We also use a gridded EA (500 x 500 m) in describing our basic spatial units (BSU).
9. Analyses
We fit an ISDM using the point process framework (Illian et al. 2008) to our processed datasets described in Section 1.7. ISDMs are various observation models that are linked together by a shared ecological process (Isaac et al. 2020; Dorazio 2014; Fithian et al. 2015).
9.1. General overview of ISDMs
In this subsection, we present an overview of the ISDMs for general understanding of this document. For further details on ISDM, refer to Isaac et al. (2020), Dorazio (2014) and Fithian et al. (2015). The models are described using the point process framework as described in Adjei et al. (2023). Additionally, we present the description of these models under the assumption that we are fitting a multi-species ISDM using presence-only and presence-absence data.
In this subsection, we describe the framework by defining a “multi-group” model, where the group can be on any taxonomic level such as species, genus, order, etc. In Section 1.9.2 and Section 1.9.3, we will define what the group represent in the ISDMs fitted there.
We model the presence-only data as a Poisson point process model (Illian et al. 2008) with mean intensity \(\lambda_i(s)\) for each group \(i = 1, \ldots, S\), where \(S\) is the number of groups. This mean intensity modelled as: \[ \ln(\lambda_i(s)) = \beta_{0i} + \beta_{ji} X_j(s) + \omega_i(s) + \eta(s), \tag{1}\] where \(\beta_{0i}\) is the group-specific intercept, \(\beta_{ji}\) is the group-specific effect of covariate \(j\), \(X_j(s)\) is the \(j^{th}\) covariate field, \(\omega_i(s)\) is the group-specific spatial autocorrelation field and \(\eta(s)\) is the bias field. \(\omega_i(s)\) and \(\eta(s)\) are modeled as zero mean Gaussian random field (i.e. \(\omega_i(s) \sim N(0, \Sigma)\), where \(\Sigma\) is a Mat’ern covariance matrix with variance \(\sigma_{i}^2\) and range \(\kappa_i\) and \(\eta(s) \sim N(0, \Sigma_\eta)\), where \(\Sigma_\eta\) is a Mat’ern covariance matrix with variance \(\tau^2\) and range \(\kappa_\eta\)). Unless otherwise stated, we will assume that all the \(S\) groups share the same parameters in the covariance matrix (\(\Sigma\)).
We model the presence-absence data with a logistic regression model. Let \(y_i(s)\) be a binary value at location \(s\) with \(0\) indicating absence of group \(i\) and \(1\) indicating presence of group \(i\). Then \(y_i(s) \sim \text{Bernoulli}(\Psi_i(s))\) with: \[ cloglog(\Psi_i(s)) = \beta_{0i} + \beta_{ji} X_j(s) + \omega_i(s) \tag{2}\] where the parameters \(\beta_0\), \(\beta_1\) and \(\omega\) are defined in Equation 1. We are assuming in Equation 2 that the presence-absence data is not biased.
All the ISDMs were fitted using the R-package PointedSDMs (Mostert and O’Hara 2023). PointedSDMs is a wrapper around inlabru (Bachl et al. 2019), which used the integrated nested Laplace approximation method (implemented in the R-package INLA; Rue, Martino, and Chopin (2009)) to fit hierarchical models that can be expressed as linear Gaussian models. The reader is referred to Zuur (2017), Gómez-Rubio (2020), Blangiardo and Cameletti (2015) for further reading.
To be able to fit the model, the study region is discretised by creating a mesh over the surface. The mesh we used in this study is presented in Figure 13.
9.2. ISDM for pollinators
Using the model framework defined in Section 1.9.1, we fit an ISDM to the pollinator dataset obtained from GBIF using the order: Bees, Butterflies and Hoverflies (use their right name here) as the group. We use latitude, annual mean temperature (bio1), trait mean temperature (mBio1), trait standard deviation temperature (sdBio1), trait mean Latitude (mLat), traut standard deviation of latitude (sdLat), annual mean temperature squared and interaction between annual temperature and trait mean temperature (bmInt) as covariates.
Code
# Load data and covariates
source("pipeline/dataImport/importPollinatorFromGBIF.R")
source("pipeline/dataImport/environmentalDataImport.R")
# Set the model
model <- PointedSDMs::startSpecies(PointsBeesAndButterfliesAndHoverflies, # list of pollinator dataset (containing both mergedDatasetPA and mergedDatasetPO)
Boundary = regionGeometry, # boundary of the study
Projection = newCrs, # projection
Mesh = meshForProject, #mesh used for the model
speciesEnvironment = TRUE, # Should we have pollinator specific covariate effect
speciesIntercept = TRUE, # TRUE treats the intercept as a random effect, instead of constrained as with a fixed effect
pointsIntercept = FALSE, #should we include intercept for each dataset
pointCovariates = FALSE, #do we have covariates for the presence-only data
responsePA = 'individualCount', # column name of the response values for presence-absence data
# trialsPA = 'trials',
spatialCovariates = envCovsForModel, # raster of spatia covariates
speciesName = 'Taxon', # Names of the species in the datasets
pointsSpatial = NULL, # Do not include a dataset spatial field
speciesSpatial = "replicate") # unique spatial field per species, but they share the same hyperparameters.
# Add second bias field for PO data
model$addBias(datasetNames = 'mergedDatasetPO')9.2.1 Priors
We assume the following priors for the pollinator ISDM:
The probability that the standard deviation of the pollinator spatial field is greater than \(0.5\) is \(0.1\) (i.e. \(P(\sigma_\omega > 0.5) = 0.01\)).
The probability that the standard deviation of the bias field is greater than \(1\) is \(0.01\) (i.e. \(P(\sigma_\eta > 1) = 0.01\))
The probability that the spatial range of the pollinator spatial field is less than \(1\)km is \(0.01\) (i.e \(P(\kappa_\omega < 1) = 0.01\))
The probability that the spatial range of the pollinator spatial field is less than \(1\)km is \(0.1\) (i.e. \(P(\kappa_\eta < 1) = 0.1\))
effects of continuous covariates (all covariates except land cover) and the intercept for each pollinator is assumed to be from a Normal distribution with mean \(0\) and precision of \(0.01\) and Normal distribution with mean \(0\) and precision of \(2\) respectively.
Code
# hyper parameters of the spatial field (shared across species)
for(taxon in c("Bees", "Butterflies", "Hoverflies")){
# hyper parameters of the spatial field (shared across species)
model$specifySpatial(Species = taxon, # define same prior for the all species
prior.sigma = c(0.5, 0.01), # SD of field variation;
prior.range = c(1, 0.01), # range of spatial correlation;
constr = FALSE
)
# PC prior for intercepts
model$priorsFixed(Effect = "intercept", # define same prior for the all species
Species = taxon,
mean.linear = 0,
prec.linear = 2
)
model$changeComponents(paste0("", taxon, "_bio.sq(main = ", taxon, "_bio.sq, model = ", '"', "clinear", '"', ", range = c(-10, 0))"))
model$changeComponents(paste0("", taxon, "_landCover(main = ", taxon, "_landCover, model = ", '"', "factor_contrast", '"', ", hyper = list(theta = list(initial = 0, fixed = TRUE)))"))
}
# Prior for bias field
model$specifySpatial(Bias = 'mergedDatasetPO', # Change the prior
prior.sigma = c(1, 0.01),
prior.range = c(1, 0.01))
# prior for random effects (mesh nodes of spatial field and species intercepts)
model$specifyRandom(
# precision parameter on how much each species' spatial field (how much they can deviate from the shared ___)
speciesGroup = list(model = "iid",
hyper = list(prec = list(initial = 0, fixed = TRUE))), # InitialValue of sigma^2 = 1/exp(1) = 0.3
# precision parameter on the baseline species occurrence rate
speciesIntercepts = list(initial = 0,
fixed = TRUE),
copyBias = list(model = "iid",
hyper = list(prec = list(initial = 0, fixed = TRUE)))#list(beta = list(initial = 0, fixed = TRUE))
)
model$changeComponents(paste0("mergedDatasetPO_biasField(main = geometry, model = mergedDatasetPO_bias_field,control.group = list(model =", '"', "iid", '"' , ", hyper = list(prec = list(initial = 0, fixed = TRUE))))"))
# Specify priors for covariate effects (continuous)
covariatesToAddEffects <- c("bio1", "bmInt", "RegCom", "mBio1", "sdBio1", "sdmInt", "Latitude", "mLat", "sdLat", "mlatInt")
#covariatesToAddEffects <- c("bio1","Latitude")
for(covs in covariatesToAddEffects){
model$priorsFixed(Effect = covs,
mean.linear = 0,
prec.linear = 0.01)
}9.2.2 Fit model and make predictions
We fit the model by using the Gaussian strategy of the empirical Bayes integration strategy in R-package INLA (Rue, Martino, and Chopin 2009). Note that the model fit returns effect values for each pollinator taxon.
Code
modelOptions <- list(num.threads = 4,
control.inla = list(int.strategy = 'eb',
diagonal = 0.001,
cmin = 0,
strategy = "gaussian",
control.vb= list(enable = FALSE)),
verbose = FALSE,
safe = TRUE,
inla.mode = "experimental")
modelRun <- PointedSDMs::fitISDM(data = model,
options = modelOptions)
# predictions for this model
individualDatasetPreds <- predict(modelRun,
data = ppxl,
predictor = TRUE,
n.samples = 100)Next, we cluster all the over \(8000\) pollinator species into groups using a clustering algorithm. We explored using k-means and hierarchical clustering method. K-means clustering provides a simple approach for partitioning a data set into \(K\) distinct non-overlapping clusters such that the within-cluster-variation is small as possible (Hastie, Tibshirani, and Friedman 2009). We need to choose the number of cluster and the number of starts for the algorithm, which can have an impact on the results. The cluster is defined by the trait annual temperature (mBio1) and latitude (mLat) displayed in Figure 11. We assess what the optimal choice of the number of clusters is by varying the number of clusters from \(1\) to \(10\) and choosing the cluster with the smallest within-cluster- variation. The optimal choice of the number of clusters is \(3\). Hence, we chose \(K = 3\) and \(n.start = 20\), after we checked the possible number of clusters that yields the smallest within-cluster-variation.
Alternatively, hierarchical clustering helps us explore the choice of \(K\) clusters through a dendogram. We then cut the tree at \(k = 3\). We present the three clusters from the two clustering algorithms in Figure 14.
Code
if(!file.exists(paste0(dataFolder, "/pollinatorDataFolder/bioTraits.csv"))){
# Calculate the mean and SD of the bio1
speciesDataImported <- readRDS(paste0(dataFolder, "/pollinatorDataFolder/speciesDataImported.RDS"))
speciesData <- speciesDataImported[["species"]]%>%
do.call("rbind", .)%>%
sf::st_as_sf(., coords = c('X', 'Y'), crs = newCrs)%>%
dplyr::select(acceptedScientificName, Taxon)
# Extract latitude and annual temperature
climaticVarsExtracted <- terra::extract(environmentalCovariates,
speciesData)%>%
dplyr::select(Latitude, bio1, landCover)%>%
cbind(., st_coordinates(st_as_sf(speciesData)))%>%
st_as_sf(., coords = c('X', 'Y'), crs = newCrs)%>%
st_join(st_as_sf(speciesData), .,join = st_nearest_feature, left = TRUE)
# Estimate trait values
Bio1Traits <- aggregate(bio1~acceptedScientificName+Taxon,
climaticVarsExtracted,
mean)
names(Bio1Traits)[3] <- "mBio1"
Bio1Traits$sdBio1 <- aggregate(bio1~acceptedScientificName+Taxon,
climaticVarsExtracted,
sd)[,3]
Bio1Traits$mLat <- aggregate(Latitude ~ acceptedScientificName+Taxon,
climaticVarsExtracted,
mean)[,3]
Bio1Traits$sdLat <- aggregate(Latitude ~ acceptedScientificName+Taxon,
climaticVarsExtracted,
sd)[,3]
#######################
# Cluster these data
#####################
n_clusters <- 10
# Initialize total within sum of squares error: wss
wss <- numeric(n_clusters)
set.seed(123)
n <- 10
# Look over 1 to n possible clusters
for (i in 1:n) {
print(i)
# Fit the model: km.out
km.out <- kmeans(Bio1Traits[, c("mLat", "mBio1")], centers = i, nstart = 20)
# Save the within cluster sum of squares
wss[i] <- km.out$tot.withinss
}
# Produce a scree plot
wss_df <- tibble(clusters = 1:n, wss = wss)
scree_plot <- ggplot(wss_df, aes(x = clusters, y = wss, group = 1)) +
geom_point(size = 4)+
geom_line() +
scale_x_continuous(breaks = c(2, 4, 6, 8, 10)) +
xlab('Number of clusters')+
geom_hline(
yintercept = wss,
linetype = 'dashed',
col = c(rep('#000000',2),'#FF0000', rep('#000000', 7))
)
plot(scree_plot)
# Fi the final model with 3 clusters
km.out <- kmeans(Bio1Traits[, c("mLat", "mBio1")], centers = 3, nstart = 1)
Bio1Traits$clusterID <- factor(km.out$cluster)
# Set the values of whether the species is temperate, alpine or boreal
Bio1Traits <- Bio1Traits %>%
dplyr::mutate(zone = ifelse(clusterID ==1 , "alpine",
ifelse(clusterID == 2, "temperate", "boreal")))
# Merge these estimates with the climatic variables extracted
biotraits <- merge(climaticVarsExtracted,
Bio1Traits ,
by = c("acceptedScientificName", "Taxon"))%>%
na.omit()
#save the results
write.csv(biotraits, file = paste0(dataFolder, "/pollinatorDataFolder/bioTraits.csv"))
#plot(biotraits$mBio1, biotraits$bio1)
} else {
biotraits <- readr::read_csv("Data/pollinatorDataFolder/bioTraits.csv")
}The aim is to assign the same predicted intensity (\(\lambda\)) to each species within the cluster. The means for each cluster is presented in Table 9. We name these three clusters as: 1 - temperate, 2 - boreal and 3 - alpine zones.
| Zones | Annual temperature | Latitude |
|---|---|---|
| Alpine | -0.359 | 0.994 |
| Temperate | 1.599 | -1.040 |
| Boreal | 0.716 | -0.449 |
We predict the intensity of pollinators within each zone, by assigning a fixed mBio1 and mLat value (with the value from Table 9 and properly estimating the interaction values as well) whilst keeping the other covariate rasters (see Figure 11) the same. Thus, we obtain nine (\(9\)) predictions maps foe each combination of taxon and zone. Each pollinator species within the combination of taxon and zone (Table 10) are assigned their respective predicted intensity .
| Pollinator species | Taxon | Cluster zones |
|---|---|---|
| Gillmeria pallidactyla (Haworth, 1811) | Butterflies | boreal |
| Dioryctria abietella (Denis & Schiffermüller), 1775 | Butterflies | alpine |
| Coleophora vacciniella Herrich-Schäffer, 1861 | Butterflies | boreal |
| Ypsolopha dentella (J.C.Fabricius, 1775) | Butterflies | alpine |
| Mycetophila strigatoides (Landrock, 1927) | Hoverflies | boreal |
| Diptera | Hoverflies | boreal |
| Clossiana euphrosyne (Linnaeus, 1758) | Butterflies | boreal |
| Psychoda phalaenoides (Linnaeus, 1758) | Hoverflies | boreal |
| Bolitophila bimaculata Zetterstedt, 1838 | Hoverflies | boreal |
| Aclista bitensis (Kieffer, 1909) | Bees | boreal |
| Platycheirus ramsarensis Goeldlin de Tiefenau, Maibach & Speight, 1990 | Hoverflies | boreal |
| Agriphila straminella (Denis & Schiffermüller), 1775 | Butterflies | alpine |
| Platypalpus agilis (Meigen, 1822) | Hoverflies | alpine |
| Amphipoea Billberg, 1820 | Butterflies | alpine |
| Noctua pronuba Linnaeus, 1758 | Butterflies | alpine |
9.2.3 Pollinator ISDM validation
Additionally, we perform validation to assess the trait effect on the pollinator distribution across the study region (Mostert and O’Hara 2023). We re-fit the ISDM in Section 1.9.2 using the randomised trait covariate (Figure 12. We compare the deviance information criterion (DIC) of the two models (the one fitted with the true covariates in Section 1.9.2 and the re-fitted model described here). The model with the lowest DIC score is deemed to be the best. If the model with the randomised covariates is chosen as the best, we can conclude that the trait covariates are not significant. If the model fitted in Section 1.9.2 is deemed best, we can conclude that the trait covariates are significant.
9.3. ISDM for plant species
There are \(54\) plant species we aim to estimate their species distribution. Looking at the number of occurrence records in Table 7, developing a multi-species ISDM will produce stable estimates for the species with fewer occurrence records like Hieracium umbellatum. However, we do not have the required memory allocations to fit all the \(54\) species together, so we split the total number of species into six groups, with nine species in each group. The splitting is done by sorting the total occurrence of each species in the ASO data and assigning the species to each group so that the highest occurrence is in the first group, the second is in the second group and the list continues.
Code
# PointedSDMs takes a longer time to fit for many species
# The trick to to break it into smaller groups
# The split will be done by the number of present speccies in each location
# I will split it nGroups time
sortedSpecies <- table(asoDatasf$acceptedScientificName)%>%
as.data.frame()%>%
dplyr::arrange(desc(Freq))%>%
select(Var1)%>%
c()
# Set the number of groups
nGroups <- 10
#split the species into groups of 10 species in each
plantSpeciesGroup <- split(sortedSpecies$Var1,
seq(1,
ceiling(length(sortedSpecies$Var1)/nGroups)))After splitting the data into the groups, we fit a multi-species ISDMs (described in Section 1.9.1) to each group.
Similar to the model the pollinator ISDM, we fitted an ISDM to the \(54\) plant species. Here, we use species as groups and soil ph, soil organic carbon, soil coarse fraction, land cover, soil moisture and annual temperature as covariates.
Code
speciesModelShared <- PointedSDMs::startSpecies(formatPlantData,
Boundary = regionGeometry,
Projection = newCrs,
Mesh = meshForProject,
responsePA = 'individualCount',
speciesEnvironment = TRUE,
speciesIntercept = TRUE,
spatialCovariates = envCovsForPlantSpeciesModel,
speciesName = 'simpleScientificName',
pointsIntercept = FALSE,
pointsSpatial = NULL, # Do not include a dataset spatial field
speciesSpatial = "replicate")
# Add bias spatial field for the PO dataset
speciesModelShared$addBias(datasetNames = 'mergedPlantsPO')9.3.1 Priors
We assume the following priors for the plant species ISDM:
The probability that the standard deviation of the plant species spatial field is greater than \(0.1\) is \(0.1\) (i.e. \(P(\sigma_\omega > 0.1) = 0.1\)).
The probability that the standard deviation of the bias field is greater than \(0.1\) is \(0.1\) (i.e. \(P(\sigma_\eta > 0.1) = 0.1\))
The probability that the spatial range of the plant species spatial field is greater than \(50\) is \(0.1\) (i.e \(P(\kappa_\omega < 50) = 0.1\))
The probability that the spatial range of the plant species spatial field is greater than \(50\) is \(0.1\) (i.e. \(P(\kappa_\eta < 50) = 0.1\))
Code
# hyper parameters of the spatial field (shared across species)
speciesModelShared$specifySpatial(Species = TRUE, # define same prior for the all species
prior.sigma = c(1, 0.1),
prior.range = c(5, 0.1))
speciesModelShared$specifySpatial(Bias = TRUE, # Change the prior
prior.sigma = c(0.6, 0.1),
prior.range = c(5, 0.1))
# prior for random effects (mesh nodes of spatial field and species intercepts)
speciesModelShared$specifyRandom(
# precision parameter on how much each species' spatial field (how much they can deviate from the shared ___)
speciesGroup = list(model = "iid",
hyper = list(prec = list(prior = "pc.prec",
param = c(0.1, 0.1)))),
# precision parameter on the baseline species occurrence rate
speciesIntercepts = list(prior = 'pc.prec',
param = c(0.1, 0.1))) 9.3.2 Model fitting and predictions
We fit the model by using the adaptive strategy of the composite design integration strategy in INLA. We predict the occurrence probability for each plant species by transforming the plant species with the inverse logit function.
Code
# specify the model options for INLA
modelOptions <- list(num.threads = 4,
control.inla = list(int.strategy = 'ccd',
diagonal = 0.001,
cmin = 0,
strategy = "adaptive",
control.vb= list(enable = FALSE)),
verbose = FALSE,
safe = TRUE,
inla.mode = "experimental")
# Species-specific spatial effects model
speciesSharedEst <- PointedSDMs::fitISDM(data = speciesModelShared,
options = modelOptions)
# I proceed with the prediction of occupancy probabilities
# over the entire region
individualDatasetPreds <- predict(speciesSharedEst,
data = ppxl,
predictor = TRUE,
n.samples = 500)9.4 Species Interractions
We visualize the bipartite networks between the pollinator taxon and the plant species using data from Rasmussen et al. (2021) for hoverflies, Museum (2023a) and Museum (2023b) for butterflies and Robinson et al. (2023) for bees.
Code
# Interraction web
# Import and format the pollinator datasets
if(!file.exists(paste0(dataFolder, "/interractionsData/interractionMatrix.csv"))){
FinalSelectionOfLepidopteraAndBeesAndHoverflies <- allPollinatorsDataset(dataFolder,
print = FALSE)
Bio1Traits <- read.csv(paste0(dataFolder,"/bioTraits.csv"))
NiNIndicators <- read.csv(paste0(dataFolder, "/interractionsData/Indikatorarter NiN eng.csv"), sep = ";")
NiNIndicatorsGenera <- do.call(rbind,lapply(unique(NiNIndicators$Genus),function(x)NiNIndicators[NiNIndicators$Genus %in% x,][1,]))
FinalSelectionOfLepidopteraAndBeesAndHoverflies$PlantGenusPollinator <- with(FinalSelectionOfLepidopteraAndBeesAndHoverflies, paste(PlantGenus,ValidSpeciesName))
OneInteractionPerPlantGenusPollinator <- do.call(rbind,lapply(unique(FinalSelectionOfLepidopteraAndBeesAndHoverflies$PlantGenusPollinator),function(x)FinalSelectionOfLepidopteraAndBeesAndHoverflies[FinalSelectionOfLepidopteraAndBeesAndHoverflies$PlantGenusPollinator%in% x,][1,]))
InteractionsInIdealMeadow <- OneInteractionPerPlantGenusPollinator[OneInteractionPerPlantGenusPollinator$PlantGenus %in% NiNIndicatorsGenera$Genus,]
InteractionsInIdealMeadowWithTraits <- merge(InteractionsInIdealMeadow,
Bio1Traits,
by.x= "ValidSpeciesName",
by.y = "validScien")
# FInd out the species in the aso data that match up to the specific genus
if(!exists("asoDatasf")) asoDatasf <- readRDS(file = paste0(dataFolder, "/plantDataFolder/formattedData/asoData.RDS"))
# Merge the asoData genus to the species data
asoDatasf <- asoDatasf%>%
mutate(
simpleScientificName = coalesce(
#redList$species[match(acceptedScientificName, redList$GBIFName)], # Match redList species
str_extract(acceptedScientificName, "^[A-Za-z]+\\s+[a-z]+") # Extract binomial name
)
)
interractionData <- merge(InteractionsInIdealMeadowWithTraits,
as.data.frame(asoDatasf[, c("simpleScientificName", "genus")])[, 1:2],
by.x = "PlantGenus",
by.y = "genus")
# save the interractions in ideal meadow
write.csv(InteractionsInIdealMeadowWithTraits,
paste0(dataFolder, "/interractionsData/InteractionsInIdealMeadowWithTraits.csv"))
write.csv(interractionData,
paste0(dataFolder, "/interractionsData/interractionData.csv"))
# Create the plant pollinator for network
#PlantPollinatorsForNetwork <- InteractionsInIdealMeadowWithTraits[c("ValidSpeciesName","PlantGenus")]
PlantPollinatorsForNetwork <- interractionData[c("Taxon.x","simpleScientificName")]
names(PlantPollinatorsForNetwork)[1:2] <- c("higher","lower")
PlantPollinatorsForNetwork$freq <- 1
PlantPollinatorsForNetwork$webID <- 1
dfForWeb <- PlantPollinatorsForNetwork[c("lower","higher","webID","freq")]
# Create webs from dataframe
WebReady <- bipartite::frame2webs(dfForWeb, varnames = c("lower", "higher", "webID", "freq"))
# save for later use
save(WebReady,
file = paste0(dataFolder, "/interractionsData/webPlot.RData"))
if(modelPlots){
plotweb(WebReady[[1]])
}
# Calculte the interraction matrix
interractionMatrix <- as.data.frame(WebReady[[1]])
interractionMatrix$species <- rownames(interractionMatrix)
rownames(interractionMatrix) <- NULL
# save the results
write.csv(interractionMatrix,
paste0(dataFolder, "/interractionsData/interractionMatrix.csv"))
} else {
load(paste0(dataFolder,"/interractionsData/webPlot.RData"))
interractionMatrix <- readr::read_csv(paste0(dataFolder,"/interractionsData/interractionMatrix.csv"))
if(modelPlots){
plotweb(WebReady[[1]])
}
}
# Calculate the
interractionProb <- interractionMatrix[, 2:4]%>%
proportions(.)%>%
dplyr::mutate(species = interractionMatrix$species)%>%
mutate(
simpleScientificName = coalesce(
#redList$species[match(acceptedScientificName, redList$GBIFName)], # Match redList species
str_extract(species, "^[A-Za-z]+\\s+[a-z]+") # Extract binomial name
),
# Replace space with underscore in simpleScientificName
species = gsub("-", "", gsub("C","", gsub(" ", "_", simpleScientificName)))
)
colnames(interractionProb)[2] <- "Butterflies"The data we got only had host plats for \(240\) out of the \(8000\) (\(3\%\)) pollinator species modeled here. To use information on the pollinator host in describing the ecosystem condition, we use aggregate the pollinator species to the order level as shown in Figure 15. Using that information, we create a network using the frame2webs function in the R-package bipartite (Dormann, Gruber, and Fruend 2008).
The values from the network for the interaction between the pollinator taxon and plant hosts are stored in a data frame such that each row represents the weight of each pollinator taxon for a given plant host. We then scale these values for each plant host such that their sum is \(1\), and use these values as weights in describing the pollinator diversity in Section 1.9.5.
9.5. Community diversity
At the habitat level, diversity is the most widely used component in describing the characteristics of a community (Thukral 2017). The alpha diverity has two components: species richness and equitability indices (Thukral 2017). A community with equal number of different species within the community will have a higher eveness index; a community dominated by one or few species in terms of the number of individuals will have a lower eveness index (Thukral 2017). A simple explanation to the diversity indices are presented here.
We use the Hill numbers, such as species richness, Shannon entropy and Simpson index, to describe the community diversity (Jost 2007, 2006; Thukral 2017). The Hill numbers are defined as (Hill 1973): \[ {}^qD = \bigg(\sum_{i = 1}^S p_i^q \bigg)^{\frac{1}{1-q}}, \tag{3}\] where \(p_i\) is the relative abundance of pollinator species \(i\) with order \(q\). The order of the diversity (\(q\)) indicates the sensitivity to common and rare species, with greater values of \(q\) (specifically \(q > 1\)) indicates a disproportional favor to common species, and lower values of \(q\) (specifically \(q < 1\)) indicates a disproportional favor to rare species. Hill numbers with order \(0\) is the species richess, Hill numbers with order \(1\) is the Shannon index and Hill numbers with order \(2\) is the Simpson index (Thukral 2017).
The relative abundance within each prediction grid cell \(s\) (\(p_i(s)\)) for Equation 3 is estimated as:
\[ \begin{split} \text{Relative abundance of pollinator species} &\propto \text{Estimated pollinator intensity} \\ &\times \text{Interraction weight} \\ & \times \text{plant species occurrence probability}\\ \implies p_i(s) &= \frac{ \frac{1}{K} \sum_k \lambda_i(s) \times w_{ik} \times \Psi_k(s)}{\sum_i \frac{1}{K} \sum_k \lambda_i(s) \times w_{ik} \times \Psi_k(s)}, \end{split} \tag{4}\] where \(w_{ik}\) is the weight of the interaction between pollinator species \(i\) and plant host \(k\), \(\Psi_k(s)\) is the occurrence probability of plant host \(k\) and \(\lambda_i(s)\) is the intensity of pollinator species \(i\).
We used the R-package vegan (Oksanen et al. 2022) to estimate the species richness, Shannon diversity and inverse Simpson index by passing the relative proportion Equation 4 into the function specnumber and diversity for species richness and both Shannon and Simpson index respectively.
Code
# Estimate diversity indices based on their taxon: either Bees, Butterflies or Hoverflies
# The function also takes into consideration whether you want to return the diversity estimate of a particular pollinator species
# If pollinatorSpeciesNames = NULL, then the function considers all the pollinator species within a zone or taxon.
# speciesForTaxon is the dataframe that contains the plant species prediction results. This is formatted from the fitted ISDMS to the ASO plant species
# zones is the zone we are interested in returning results for. It takes the values of either "all", "alpine", "boreal", "temperate"
# predRast is the prediction raster created in the masterScript
# silent indicates whether we want to print the species names or not
estimateZonalDiversity <- function(pollinatorSpeciesNames = NULL, #Names of the pollinator species (should be the simplescientific name)
speciesForTaxon, # The data with the plant species results
zones,
predRast, # The prediction raster
silent = FALSE){
# Check if the zones specified is what we expect
if(!zones %in% c("all", "alpine", "boreal", "temperate" )) stop("Taxa must either be 'all', 'alpine', 'boreal', 'temperate' ")
# Filter out the pollinator species we want
if(!zones %in% "all"){
dat <- speciesForTaxon%>%
dplyr::filter(zone %in% zones)%>%
na.omit()
} else if (zones == "all"){
dat <- speciesForTaxon%>%
dplyr::filter(zone %in% c("alpine", "boreal", "temperate"))%>%
na.omit()
}
returnAllPrediction <- is.null(pollinatorSpeciesNames )
if(is.null(pollinatorSpeciesNames )){
pollinatorSpeciesNames <- unique(dat$simpleScientificName)
} else {
dat <- dat %>%
dplyr::filter(simpleScientificName %in% pollinatorSpeciesNames)
}
if(!silent) print(pollinatorSpeciesNames)
#indSpeciesDiversity <- lapply(seq_along(pollinatorSpeciesNames), function(x){
sp <- pollinatorSpeciesNames#[x]
#get unique taxa
focalTaxa <- unique(dat$Taxon)
allZones <- c("alpine", "boreal", "temperate")
if(!zones %in% "all"){
#for now, we ensure that dat has only one pollinator group
if(length(unique(dat$zone))>1) stop("Select one zone only")
# Load the intensity of the species from the zone
spPred <- list()
for(j in 1:length(focalTaxa)){
path <- paste(c(dataFolder, "modelOutputs", "pollinators", zones, focalTaxa[j]), collapse = "/")
results <- readRDS(file.path(path, paste0("predictions", ".rds")))
results$intensity <- exp(results$mean)
# Transform the intensity to the original scale
diversity <- lapply(plantSpecies, function(plantSp){
divs <- results$intensity * speciesData[[plantSp]] * interractionProb[interractionProb$species %in% plantSp, dat$Taxon[j]]
#divs$overall <- app(divs, function(i) calculateRichness(i))
})|>
rast() |> # combine raster layers
#scale() |> # scale raster layers
setNames(plantSpecies)
# define geometries to combine with prediction
geometries <- xyFromCell(diversity, seq(ncell(diversity))) %>%
as.data.frame() %>%
st_as_sf(coords = c("x", "y"), crs = newCrs)
# Create the prediction data
PredictionData <- diversity %>%
as.data.frame(na.rm = FALSE)%>%
mutate(richness = vegan::specnumber(across(plantSpecies)),
simpson = vegan::diversity(across(plantSpecies), "invsimpson"),
shannon= vegan::diversity(across(plantSpecies)))%>%
#mutate(averageIntensity = sum(across(plantSpecies), na.rm = TRUE)/length(plantSpecies))%>%
reduce(cbind) %>%
bind_cols(geometries) %>%
st_sf()%>%
st_transform(., newCrs)%>% # transform the prediction data to the units used in model runs
filter(rowSums(is.na(.)) != (ncol(.)-1)) # Now we take all the rows with sum of NAs equal to the number of columns of the dataframe minus the
names(PredictionData) <- c(plantSpecies, "speciesNumber", "simpson", "shannon", "geometry")
PredictionData <- PredictionData%>%
mutate(averageIntensity = rowMeans(pick(all_of(plantSpecies))))%>%
select(plantSpecies, "speciesNumber", "simpson", "shannon","averageIntensity","geometry")
spPred[[j]] <- rasterize(PredictionData, predRast, names(PredictionData)[-length(names(PredictionData))])
}
names(spPred) <- focalTaxa
} else if(zones == "all"){
# Load the intensity of the species from the zone
spPred <- lapply(seq_along(allZones), function(x){
spPred <- list()
for(j in 1:length(focalTaxa)){
path <- paste(c(dataFolder, "modelOutputs", "pollinators", allZones[x], focalTaxa[j]), collapse = "/")
results <- readRDS(file.path(path, paste0("predictions", ".rds")))
results$intensity <- exp(results$mean)
# Transform the intensity to the original scale
diversity <- lapply(plantSpecies, function(plantSp){
divs <- results$intensity * speciesData[[plantSp]] * interractionProb[interractionProb$species %in% plantSp, dat$Taxon[j]]
#divs$overall <- app(divs, function(i) calculateRichness(i))
})|>
rast() |> # combine raster layers
#scale() |> # scale raster layers
setNames(plantSpecies)
# define geometries to combine with prediction
geometries <- xyFromCell(diversity, seq(ncell(diversity))) %>%
as.data.frame() %>%
st_as_sf(coords = c("x", "y"), crs = newCrs)
# Create the prediction data
PredictionData <- diversity %>%
as.data.frame(na.rm = FALSE)%>%
mutate(richness = vegan::specnumber(across(plantSpecies)),
simpson = vegan::diversity(across(plantSpecies), "invsimpson"),
shannon= vegan::diversity(across(plantSpecies)))%>%
#mutate(averageIntensity = sum(across(plantSpecies), na.rm = TRUE)/length(plantSpecies))%>%
reduce(cbind) %>%
bind_cols(geometries) %>%
st_sf()%>%
st_transform(., newCrs)%>% # transform the prediction data to the units used in model runs
filter(rowSums(is.na(.)) != (ncol(.)-1)) # Now we take all the rows with sum of NAs equal to the number of columns of the dataframe minus the
names(PredictionData) <- c(plantSpecies, "speciesNumber", "simpson", "shannon", "geometry")
PredictionData <- PredictionData%>%
mutate(averageIntensity = rowMeans(pick(all_of(plantSpecies))))%>%
select(plantSpecies, "speciesNumber", "simpson", "shannon","averageIntensity","geometry")
spPred[[j]] <- rasterize(PredictionData, predRast, names(PredictionData)[-length(names(PredictionData))])
}
names(spPred) <- focalTaxa
return(spPred)
})
names(spPred) <- allZones
}
# If we are interested in any particular pollinator species
if(!returnAllPrediction ){
return(spPred)
} else if(!zones %in% "all"){
# Find our how many species are in each group
groupCounts <- dat %>%
group_by(Taxon, zone)%>%
summarise(groupCounts = n())
# create a raster for the final results
#allSpPred <- spPred$Hoverflies$Ajuga_pyramidalis
# define geometries to combine with prediction
geometries <- crds(spPred$Hoverflies$Ajuga_pyramidalis)%>%
as.data.frame()%>%
st_as_sf(., coords = c("x", "y"), crs = newCrs)
allSpPred <- lapply(focalTaxa, function(x){
spPred[[x]]$averageIntensity * groupCounts[groupCounts$Taxon %in% x, 3]
})%>%
do.call("cbind", .)%>%
rowSums()
indSpeciesPred <- lapply(focalTaxa, function(x){
ret <- spPred[[x]]$averageIntensity%>%
as.data.frame(na.rm = TRUE)%>%
dplyr::mutate(intensity = averageIntensity/allSpPred) %>%
dplyr::select(intensity)%>%
replicate((as.numeric(unlist(groupCounts[groupCounts$Taxon %in% x, 3]) )), ., simplify = FALSE)%>%
do.call("cbind", .)
names(ret) <- dat$acceptedScientificName[dat$Taxon == x]
return(ret)
})%>%
do.call("cbind", .)%>%
as.data.frame()%>%
dplyr::mutate(richness = vegan::specnumber(across(dat$acceptedScientificName)),
simpson = vegan::diversity(across(dat$acceptedScientificName), "invsimpson"),
shannon= vegan::diversity(across(dat$acceptedScientificName)))%>%
dplyr::select(richness, shannon, simpson)%>%
bind_cols(geometries) %>%
st_sf()%>%
st_transform(., newCrs)%>% # transform the prediction data to the units used in model runs
filter(rowSums(is.na(.)) != (ncol(.)-1))
names(indSpeciesPred) <- c("richness", "shannon", "simpson","geometry")
spPred <- rasterize(indSpeciesPred, predRast, c("richness", "shannon", "simpson"))
return(spPred)
} else if(zones %in% "all"){
# Find our how many species are in each group
groupCounts <- dat %>%
group_by(Taxon, zone)%>%
summarise(groupCounts = n())
# create a raster for the final results
#allSpPred <- spPred$Hoverflies$Ajuga_pyramidalis
# define geometries to combine with prediction
geometries <- crds(spPred$alpine$Bees$Ajuga_pyramidalis)%>%
as.data.frame()%>%
st_as_sf(., coords = c("x", "y"), crs = newCrs)
allSpPred <- list()
for(i in seq_along(allZones)){
allSpPred[[i]] <- lapply(focalTaxa, function(x){
spPred[[allZones[i]]][[x]]$averageIntensity * groupCounts[groupCounts$Taxon %in% x, 3]
})%>%
do.call("cbind", .)%>%
rowSums()
}
names(allSpPred) <- allZones
indSpeciesPred <- list()
for(i in seq_along(allZones)){
indSpeciesPred[[i]] <- lapply(focalTaxa, function(x){
ret <- spPred[[i]][[x]]$averageIntensity%>%
as.data.frame(na.rm = TRUE)%>%
dplyr::mutate(intensity = averageIntensity/allSpPred[[i]]) %>%
dplyr::select(intensity)%>%
replicate((as.numeric(unlist(groupCounts[(groupCounts$Taxon %in% x) & (groupCounts$zone %in% allZones[i]), 3]) )), ., simplify = FALSE)%>%
do.call("cbind", .)
names(ret) <- dat$acceptedScientificName[dat$Taxon == x & dat$zone == allZones[i]]
return(ret)
})
}
names(indSpeciesPred) <- allZones
indSpeciesPred <- indSpeciesPred%>%
purrr::flatten() %>%
do.call("cbind", .)%>%
as.data.frame()%>%
dplyr::mutate(richness = vegan::specnumber(across(dat$acceptedScientificName)),
simpson = vegan::diversity(across(dat$acceptedScientificName), "invsimpson"),
shannon= vegan::diversity(across(dat$acceptedScientificName)))%>%
dplyr::select(richness, shannon, simpson)%>%
bind_cols(geometries) %>%
st_sf()%>%
st_transform(., newCrs)%>% # transform the prediction data to the units used in model runs
filter(rowSums(is.na(.)) != (ncol(.)-1))
names(indSpeciesPred) <- c("richness", "shannon", "simpson","geometry")
spPred <- rasterize(indSpeciesPred, predRast, c("richness", "shannon", "simpson"))
return(spPred)
}
}
# Load the functions needed to estimate the diversity
source("pipeline/richness/utils/estimateZonalDiversity.R")
source("pipeline/richness/utils/estimateTaxonDiversity.R")
# Getting diversity indices
if(!exists("biotraits")) source("pipeline/predictions/pollinatorZoneGroupings.R")
speciesForTaxon <- biotraits %>%
dplyr::select(acceptedScientificName, Taxon, zone)%>%
group_by(Taxon, zone)%>%
distinct()%>%
ungroup()%>%
mutate(
simpleScientificName = coalesce(
#redList$species[match(acceptedScientificName, redList$GBIFName)], # Match redList species
str_extract(acceptedScientificName, "^[A-Za-z]+\\s+[a-z]+") # Extract binomial name
))%>%
na.omit()
# Load all the plant predictions result
if(!exists("interractionMatrix")) source("pipeline/webForInterractions/webPlotsForModels.R")
asoDatasf <- readRDS(file = paste0(dataFolder, "/plantDataFolder/formattedData/asoData.RDS"))%>%
mutate(
simpleScientificName = coalesce(
#redList$species[match(acceptedScientificName, redList$GBIFName)], # Match redList species
str_extract(acceptedScientificName, "^[A-Za-z]+\\s+[a-z]+") # Extract binomial name
)) %>%
dplyr::filter(simpleScientificName %in% c(interractionMatrix$species)) #
plantSpecies <- unique(asoDatasf$simpleScientificName)
plantSpecies <- sapply(plantSpecies, function(x){
rt <- stringr::str_split(x, " ")[[1]]
taxon <- paste(rt[1], rt[2], sep = "_")
return(taxon)
}) %>%
c()
plantSpecies <- as.character(plantSpecies)
# for(sp in species){
# print(sp)
# spPred <- allPlantPredictions[[sp]]%>%
# select("mean", "sd", "q0.025", "q0.5", "q0.975", "median", "Probabilities")#%>%
# #mutate(Probabilities = myinvcloglog(mean))
# #individualDatasetPreds[[which(species %in% sp)]] <- spPred
# spPred <- rasterize(spPred, predRast, c("mean", "sd", "q0.025", "q0.5", "q0.975", "median", "Probabilities"))
# path <- paste0(dataFolder, "/modelOutputs/plants/", sp)
#
# if(!file.exists(path)){
# dir.create(path)
# }
# saveRDS(spPred, file.path(path, paste0("predictions.rds")))
# }
##################################################
# Plant Species results
#########################################
speciesData <- lapply(paste0(dataFolder, "/modelOutputs/plants/", plantSpecies), function(x){
# try(list.files(x, pattern = paste0("predictions.rds"), recursive = TRUE, full.names = TRUE))
preds <- readRDS(list.files(x, pattern = paste0("predictions.rds"), recursive = TRUE, full.names = TRUE))
return(preds$Probabilities)
})|>
rast() |> # combine raster layers
#scale() |> # scale raster layers
setNames(plantSpecies)
#######################################################
# PlantData
#######################################################
#pollinatorSpeciesNames <- unique(speciesForTaxon$simpleScientificName)
# populate the species distributions
###############################################
# Diversity for alpine regions
##############################################
if(rerun){
# pollinatorSpeciesNames <- speciesForTaxon%>%
# dplyr::filter(zone == "alpine")%>%
# na.omit()%>%
# dplyr::select(simpleScientificName)%>%
# unique()%>%
# c()
######################################
# ZONAL DIVERSITY ESTIMATES
#####################################
zone <- c("alpine", "boreal", "temperate")
for(sp in zone){
spPred <- estimateZonalDiversity(pollinatorSpeciesNames = NULL,
speciesForTaxon = speciesForTaxon,
zones = sp,
predRast = predRast,
silent = FALSE)
path <- paste0(dataFolder, "/modelOutputs/diversity/", sp)
spPred <- rasterize(spPred, predRast, names(spPred))
if(!file.exists(path)){
dir.create(path)
}
saveRDS(spPred, file.path(path, paste0("predictions.rds")))
}
# Make predictions for all the zones
sp <- "all"
spPred <- estimateZonalDiversity(pollinatorSpeciesNames = NULL,
speciesForTaxon = speciesForTaxon,
zones = sp,
predRast = predRast,
silent = FALSE)
path <- paste0(dataFolder, "/modelOutputs/diversity/", sp)
spPred <- rasterize(spPred, predRast, names(spPred))
if(!file.exists(path)){
dir.create(path)
}
saveRDS(spPred, file.path(path, paste0("predictions.rds")))
rm("spPred")
######################################
# POLLINATOR TAXON DIVERSITY ESTIMATES
#####################################
taxa <- c("Butterflies", "Bees", "Hoverflies")
for(sp in taxa){
spPred <- estimateTaxonDiversity(pollinatorSpeciesNames = NULL,
speciesForTaxon = speciesForTaxon,
taxa = sp,
predRast = predRast,
silent = FALSE)
path <- paste0(dataFolder, "/modelOutputs/diversity/", sp)
spPred <- rasterize(spPred, predRast, names(spPred))
if(!file.exists(path)){
dir.create(path)
}
saveRDS(spPred, file.path(path, paste0("predictions.rds")))
rm("spPred")
}
# Make predictions for all the zones
sp <- "allTaxa"
spPred <- estimateTaxonDiversity(pollinatorSpeciesNames = NULL,
speciesForTaxon = speciesForTaxon,
taxa = sp,
predRast = predRast,
silent = FALSE)
path <- paste0(dataFolder, "/modelOutputs/diversity/", sp)
#spPred <- rasterize(spPred, predRast, names(spPred))
if(!file.exists(path)){
dir.create(path)
}
saveRDS(spPred, file.path(path, paste0("predictions.rds")))
} else {
# load(paste0(dataFolder, "/modelOutputs/diversity/alpineDiversity.RData"))
# load(paste0(dataFolder, "/modelOutputs/diversity/borealDiversity.RData"))
alpine <- readRDS("Data/modelOutputs/diversity/alpine/predictions.rds")
boreal <- readRDS("Data/modelOutputs/diversity/boreal/predictions.rds")
temperate <- readRDS("Data/modelOutputs/diversity/temperate/predictions.rds")
bees <- readRDS("Data/modelOutputs/diversity/Bees/predictions.rds")
butterflies <- readRDS("Data/modelOutputs/diversity/Butterflies/predictions.rds")
hoverflies <- readRDS("Data/modelOutputs/diversity/Hoverflies/predictions.rds")
}10. Results
10.1. Distribution of pollinators accross Norway
10.1.1. ISDM parameter estimates
We present the estimates of the climatic and trait variable covariates in Figure 16 and Figure 17. The pollinator abundance increases with trait latitude (mLat) and temperature (mBio1) as well as annual temperature (polyBio1 and polyBio2). However, the abundance decreases with an increasing latitude (Latitude). Keeping all other covariates contants, the results shows that we expect butterflies to be more abundant, followed by hoverflies and bees across Norway (see the intercept in Figure 16).
The pollinators favor built-up areas with reference to agro-forestry (Figure 17. Conversely, the intensity of the pollinators are lower at the other land cover categories with reference to the agro-foresty regions.
10.1.2. Validation of trait effects
The model selection shows that the ISDM fitted using the original traits is better than those with randomised trait covariates Table 11. This is because the Deviance Information criterion (DIC) and Watanabe-Akaike information criterion (waic) were smaller in the ISDM with original traits.
| Metric | Original traits | Randomised traits | Original traits Better |
|---|---|---|---|
| mlik | -1.107112e+06 | -1.109778e+06 | FALSE |
| waic | 3.881022e+05 | 4.245882e+05 | TRUE |
| dic | -2.155645e+00 | -1.857201e+00 | TRUE |
10.1.3. Spatial bias
We present the posterior mean of the bias field added to the presence-only data observation model (see \(\eta(s)\) described in Section 1.9.1) from the pollinator ISDM in Figure 18. This bias field describes the sampling, detection, reporting and other biases that are present in the presence-only data. We see a smooth spatial field as we have a more even spread of the pollinators accross the study region (see ?@fig-pollinatorPOandNSOS).
10.1.4. Prediction of pollinator distribution
We present the predictions of log-intensity of the pollinators in Figure 20. We see that the butterfliea are more abundant at Northern Norway, with the hoverflies and bees abundant at the southern region. This seems surprising as the abundance of all the pollinators decreases as the latitude increases and annual temperature increases Figure 16. Future iterations of this work will explore using a more finer mesh to fit the model and make thse predictions to ensure more stable and interpretable results.
Finally, we present the log-intensity of the alpine, boreal and temperate pollinators across Norway in Figure 20. On this scale, we do not see a clear difference between the various zones. The effect of the mBio1 and mLat are not large enough to cause a significant changes in the predicted zones (although on the original scale, we see some changes but not too significant).
10.2. Distribution of plant hosts
10.2.1 ISDM parameter estimates
We present the estimates of the climatic and trait variable covariates in Figure 21 and Figure 22. Generally, the plant host occurrence are higher at regions with hgh soil PH and annual mean temperature.
The pollinators favor built-up areas with reference to agro-forestry and other forest (Figure 22). Conversely, the intensity of the pollinators are lower at the other land cover categories with reference to the agro-foresty regions.
10.2.2. Spatial bias
We present the posterior mean of the bias field added to the presence-only data observation model (see \(\eta(s)\) described in Section 1.9.1) from the plant hosts ISDM in Figure 23. This bias field describes the sampling and/or reporting biases that are present in the presence-only data. We see that regions with presence-only records have higher intensity than regions with low presence occurrence records(see Figure 9).
10.2.3 Plant host occurrence probability
We present the posterior mean of the plant host occurrence probability across Norway in Figure 24. Some species (such as Veronica officinalis and Filipendula ulmaria) have relatively higher occurrence probabilities across the region whilst others (such as Hieracium murorum) have relatively smaller occurrence probabilities.
10.3. Alpha diversity estimates
10.3.1 Diversity at ASO meadows
Although we estimated species richness and Shannon inices, We present the inverse Simpson indices for the alpine Figure 25, boreal Figure 26 and temperate Figure 27 pollinators at the ASO meadows. We group these estimates based on the value of the annual mean temperature (bio1) and latitude (see Section 1.4.2.1 and Section 1.9.5 for details).
It can observed that there are no ASO meadow locations within negative annual temperature and positive latitude zones (Figure 25, Figure 26, Figure 27). Regions with high temperature (positive regions) are more diverse than regions with low temperature. Additionally, meadows at lower latitudes with colder temperatures are the least diverse.
It is obvious that diversity increases with the species richness and eveness. The species richness across all meadows for each pollinator group (alpine, boreal and temperate pollinators) were all the same, and so the diversity within a meadow reflects how evenly species are distributed within the meadow.
We also present the Shannon-Weiner and inverse Simpson diversity indices for the alpine, boreal and temperate pollinators in Figure 28. Both indices tell the same story: there are more diverse temperate pollinators, than boreal pollinators and alpine pollinators. When we put all the zones together, we see that the diversity is higher at regions with low altitude and high latitude (last column in Figure 28 and compare with column called annual temperature in Figure 11).
10.3.2. Diversity across Norway
Finally, the diversity of bees, butterflies and hoverflies across the study region are presented in the last row in Figure 28. We see that bees and butterflies have similar diversity distribution, with hoverflies being more diverse than both of them. However, all the pollinators are more diverse at regions with low latitude and high mean temperatures (a result consistent within this chapter).
10.4. Pollinator indicators
The reader is referred to Section 1.4 for the results.